Search Architecture at Evernote 
Not Your Typical Big Data Problem 
CHRISTIAN KOHLSCHÜTTER 
Sr. Search Researcher 
Augmented Intelligence @ Evernote
We are the workspace.
Write, Collect, Find, Present
Serving 100+ Million Users Worldwide 
• 559 Shards (200k users per shard), Linux/Tomcat/MySQL 
• 3.2 PB WebDAV-based Storage 
• 224 TB SSD capacity for System, MySQL and Lucene 
• 3.1 Billion Notes stored, 3.8 Bn Notes ever created 
• 115 Million Notes created or edited last week 
• 26 Million API calls to Context last week 
• 1 Lucene index per user
Evernote’s Three Laws of Data Protection 
• Your Data is Yours 
• Your Data is Protected 
• Your Data is Portable 
We are not a “big data” company and do not try to make 
money from your content.
Technical Debt 
• I/O over Lucene 2.9 indexes became a bottleneck 
• Code was woven into our “NoteStore” platform 
• Index changes had to be backwards-compatible 
• Complex re-indexing would require taking down a shard 
• Needed to rethink the entire architecture, but keep public API 
• Make search faster vs. Make us move faster
From Lucene 2.9 to 4.x and beyond 
• Large refactoring of search code 
• Lucene is no longer a direct dependency in “NoteStore”
• Design-by-Contract 
• Can now run multiple Lucene versions concurrently in one VM 
• … and one specific version / schema per user 
• Migrated all users to Lucene 4.5, avg. downtime/user < 1 min
Separate the What from the How
Separation of Concerns
[Diagram. Components: NoteStore, UserIndexManager, UserIndexFactory, UserIndex. UserIndex variants: Benchmarking UserIndex, Lucene29 UserIndexImpl, Lucene4 UserIndexImpl, Caching UserIndex, ... Layer labels: API, Implementation.]
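The split between API and implementation can be illustrated with a minimal sketch. The slides name UserIndex, UserIndexFactory, UserIndexManager and a Caching UserIndex, but show none of their signatures, so every method below (search, implementationName, open, assign) and the StubUserIndex class are assumptions made for illustration only.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical API: callers such as NoteStore would see only this interface.
interface UserIndex {
    List<String> search(String query);
    String implementationName();
}

// Hypothetical factory contract: one factory per search implementation.
interface UserIndexFactory {
    UserIndex open(int userId);
}

// Stand-in implementation; a real one would wrap one specific Lucene version.
final class StubUserIndex implements UserIndex {
    private final String name;
    StubUserIndex(String name) { this.name = name; }
    @Override public List<String> search(String query) { return List.of(); }
    @Override public String implementationName() { return name; }
}

// Decorator: same interface, adds result caching on top of any implementation.
final class CachingUserIndex implements UserIndex {
    private final UserIndex delegate;
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    CachingUserIndex(UserIndex delegate) { this.delegate = delegate; }
    @Override public List<String> search(String query) {
        return cache.computeIfAbsent(query, delegate::search);
    }
    @Override public String implementationName() { return delegate.implementationName(); }
}

// Picks a factory per user, so each user can run on a different implementation.
final class UserIndexManager {
    private final Map<String, UserIndexFactory> factories;
    private final Map<Integer, String> implByUser = new ConcurrentHashMap<>();
    private final String defaultImpl;

    UserIndexManager(Map<String, UserIndexFactory> factories, String defaultImpl) {
        this.factories = factories;
        this.defaultImpl = defaultImpl;
    }

    void assign(int userId, String impl) { implByUser.put(userId, impl); }

    UserIndex open(int userId) {
        String impl = implByUser.getOrDefault(userId, defaultImpl);
        UserIndexFactory factory = factories.get(impl);
        if (factory == null) {
            throw new IllegalStateException("No factory registered for " + impl);
        }
        // Each call wraps a fresh cache; a real manager would likely reuse instances.
        return new CachingUserIndex(factory.open(userId));
    }
}
```

Because callers depend only on the interface, swapping the implementation behind a user's index does not touch calling code.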
Hide Lucene behind ClassLoaders 
• One Maven artifact per major Lucene version, with build profiles for code reuse between minor updates 
• Code is packaged with its dependencies into one common fat-jar, with a prefix for each implementation: 
  - lucene29/org/apache/lucene/... 
    lucene29/com/evernote/search/lucene2/… 
  - lucene43/org/apache/lucene/... 
    lucene43/com/evernote/search/lucene4/… 
  - lucene45/org/apache/lucene/… 
    lucene45/com/evernote/search/lucene4/… 
• ResourcePrefixClassLoader, called from outside code, strips the prefix and uses the fat-jar as its only dependency
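The prefix mechanism can be sketched with a small custom class loader. The source of ResourcePrefixClassLoader is not shown in the slides, so the class below (PrefixClassLoaderSketch) is a hypothetical reconstruction of the idea only: it looks for a prefixed copy of each class first and falls back to the parent for everything else. It assumes a non-null parent loader that can see the fat-jar.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch, not Evernote's ResourcePrefixClassLoader.
final class PrefixClassLoaderSketch extends ClassLoader {
    private final String prefix; // e.g. "lucene45/"

    PrefixClassLoaderSketch(String prefix, ClassLoader parent) {
        super(parent);
        this.prefix = prefix.endsWith("/") ? prefix : prefix + "/";
    }

    // Maps a binary class name to its prefixed location inside the fat-jar.
    String resourcePath(String className) {
        return prefix + className.replace('.', '/') + ".class";
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    c = findClass(name); // prefixed copy wins
                } catch (ClassNotFoundException e) {
                    c = super.loadClass(name, false); // JDK and shared API classes
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Read the class bytes from the prefixed path, then define them under
        // the unprefixed name, which is what "strips the prefix" amounts to here.
        try (InputStream in = getParent().getResourceAsStream(resourcePath(name))) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            byte[] bytes = in.readAllBytes();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

With one loader instance per prefix, classes with the same name from different Lucene versions get distinct identities inside one VM, since a class is identified by its name together with its defining loader.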
New Index Structure 
• Each user’s index now comes with a properties file that 
describes its internal structures, such as index type and 
version. Code can branch on these values to handle each variant. 
• Changes to the index schema? Just increase the index version 
and handle the rest in code 
• Automatically trigger re-indexing if necessary
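A per-index properties file of this kind could be read as in the sketch below. The slides do not give the file's keys, so "index.type" and "index.version" are invented for illustration, as is the IndexDescriptor class.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical descriptor for one user's index; key names are assumptions.
final class IndexDescriptor {
    final String type;
    final int version;

    IndexDescriptor(String type, int version) {
        this.type = type;
        this.version = version;
    }

    // Parses the text of a properties file; missing keys fall back to defaults.
    static IndexDescriptor parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String type = p.getProperty("index.type", "unknown");
        int version = Integer.parseInt(p.getProperty("index.version", "0"));
        return new IndexDescriptor(type, version);
    }

    // A re-index is needed when the type differs or the schema version is older.
    boolean needsReindex(String expectedType, int expectedVersion) {
        return !type.equals(expectedType) || version < expectedVersion;
    }
}
```

Bumping the expected version in code is then enough to make every older index report that it needs a rebuild.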
Index Auto-Migration 
• Target Default Index Implementation centrally set by DevOps 
• Triggered upon UserIndex access 
• UserIndex facade determines whether re-index is necessary 
• “Cruise Control” automates off-peak access 
[Chart: axis label “# Threads”]
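The access-triggered migration can be sketched as a facade that compares each user's current implementation with a centrally set target. All names here (MigrationFacade, setTarget, register, access) are hypothetical, and the rebuild itself is reduced to a placeholder.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of migrate-on-access; not the actual UserIndex facade.
final class MigrationFacade {
    private final AtomicReference<String> targetImpl; // set centrally, e.g. by DevOps
    private final Map<Integer, String> currentImpl = new ConcurrentHashMap<>();
    private int reindexCount = 0;

    MigrationFacade(String initialTarget) {
        this.targetImpl = new AtomicReference<>(initialTarget);
    }

    void setTarget(String impl) { targetImpl.set(impl); }

    void register(int userId, String impl) { currentImpl.put(userId, impl); }

    // Called on every index access; migrates first if the user is off-target.
    synchronized String access(int userId) {
        String target = targetImpl.get();
        if (!target.equals(currentImpl.get(userId))) {
            reindex(userId, target);
        }
        return currentImpl.get(userId);
    }

    private void reindex(int userId, String target) {
        // Placeholder: a real rebuild would re-read the user's notes into a new index.
        currentImpl.put(userId, target);
        reindexCount++;
    }

    synchronized int reindexCount() { return reindexCount; }
}
```

An off-peak job in the spirit of “Cruise Control” would only need to call access for each user at a throttled rate to drive the migration.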
Phase 1: Migration to Lucene 4 
• Changes in Disk I/O (CPU correlates) 
  - overall: -81% 
  - searchRelatedNotes: -87% 
  - keyword-based search: -96% 
Saves TBs of I/O
Phase 2: Add Compression 
• User Indexes sizes and access patterns are skewed 
• Optimize large accounts 
• Directory-level compression 
• Compress segment files, invisible to the IndexReader 
• Only when re-indexing / every 3 months 
• In-memory Caching
LuceneTransform 
• https://code.google.com/p/lucenetransform by Mitja Lenič 
• We ported it to Lucene 4.5 (now available upstream for 4.9) 
• Improved LRU caching, added LZ4/Snappy compression 
• We will contribute our changes soon
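To show what an LRU cache over decompressed blocks looks like, here is a minimal sketch built on LinkedHashMap in access order. It is not LuceneTransform's cache; the class name, the block-id key and the hit-rate counter are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache for decompressed blocks, keyed by a block id.
final class LruBlockCache {
    private final Map<Long, byte[]> blocks;
    private long hits;
    private long misses;

    LruBlockCache(final int maxBlocks) {
        // accessOrder = true moves an entry to the end on every get().
        this.blocks = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > maxBlocks; // evict the least recently used block
            }
        };
    }

    synchronized byte[] get(long blockId) {
        byte[] block = blocks.get(blockId);
        if (block == null) {
            misses++;
        } else {
            hits++;
        }
        return block;
    }

    synchronized void put(long blockId, byte[] data) {
        blocks.put(blockId, data);
    }

    synchronized double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

On a miss, the caller would read and decompress the block from disk and then put it into the cache; each hit avoids both the disk read and the decompression.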
OverlayDirectory 
on disk: 
_23.cfe 
_23.si 
c$_23.cfs 
segments.gen 
segments_2 
visible to IndexReader: 
_23.cfe 
_23.si 
_23.cfs 
segments.gen 
segments_2
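The listing above suggests a simple name mapping: a compressed file carries a "c$" prefix on disk and appears under its plain name to the reader. The sketch below models only that mapping on file names; the OverlayNames class and its methods are hypothetical and do not touch Lucene's Directory API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper modelling the on-disk vs. visible file names.
final class OverlayNames {
    // Prefix taken from the slide's example file c$_23.cfs.
    static final String COMPRESSED_PREFIX = "c$";

    static boolean isCompressed(String onDiskName) {
        return onDiskName.startsWith(COMPRESSED_PREFIX);
    }

    // Name the IndexReader would see for a given on-disk file.
    static String visibleName(String onDiskName) {
        return isCompressed(onDiskName)
                ? onDiskName.substring(COMPRESSED_PREFIX.length())
                : onDiskName;
    }

    static List<String> listVisible(List<String> onDiskNames) {
        List<String> visible = new ArrayList<>();
        for (String name : onDiskNames) {
            visible.add(visibleName(name));
        }
        return visible;
    }

    // Finds the on-disk file behind a visible name, or null if none exists.
    static String resolve(String visibleName, List<String> onDiskNames) {
        String compressed = COMPRESSED_PREFIX + visibleName;
        if (onDiskNames.contains(compressed)) {
            return compressed;
        }
        return onDiskNames.contains(visibleName) ? visibleName : null;
    }
}
```

A directory wrapper built on this mapping would open the resolved file and, when it is compressed, hand the reader a decompressing input instead of the raw bytes.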
Results 
• Compressed the largest 5% of all indexes using LZ4 
• 1.9 TB index space saved 
• Hit rate of the 100 MB LRU cache: 79% on average (range 67% to 93%) 
• Saved 0.5 PB disk reads/week 
• The cache is effective enough that we may switch to a stronger but slower 
compression algorithm and may apply compression to more users 
Saves PBs of I/O
Bugs, Bugs, Bugs :-) 
• We’ve been warned 
  - “VInt bug” 
  - “background merge hit exception” 
  - JVM segfaults 
• and then this happened, too 
  - SPI / ContextClassLoaders … LUCENE-4713 
  - Deadlocks / over-optimistic locking 
  - Unclosed resources / Too many open file handles => HousekeepingDirectory 
  - Issues with FieldCache singleton => LUCENE-831, LUCENE-2133, … 
  - … 
• UserIndex tracks “broken” state; allows self-healing (rebuild)
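Tracking a “broken” state with a rebuild on next use can be sketched as follows. The slides do not describe how UserIndex does this, so the SelfHealingIndex class, its State enum and the run method are assumptions, and the rebuild is a placeholder.

```java
import java.util.function.Supplier;

// Hypothetical sketch of broken-state tracking with rebuild on next access.
final class SelfHealingIndex {
    enum State { OK, BROKEN }

    private State state = State.OK;
    private int rebuilds = 0;

    synchronized State state() { return state; }

    synchronized int rebuilds() { return rebuilds; }

    // Runs an index operation; a failure marks the index broken and returns
    // the fallback, and the next call rebuilds before running.
    synchronized <T> T run(Supplier<T> operation, T fallback) {
        if (state == State.BROKEN) {
            rebuild();
        }
        try {
            return operation.get();
        } catch (RuntimeException e) {
            state = State.BROKEN;
            return fallback;
        }
    }

    private void rebuild() {
        // Placeholder: a real rebuild would re-index the user's notes from the source of truth.
        rebuilds++;
        state = State.OK;
    }
}
```

Since the notes themselves live outside the index, a per-user index can always be regenerated, which is what makes this kind of recovery safe.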
Conclusion 
• Design-By-Contract, Separation of Concerns 
• Per-user Search Implementation / Multiple Lucene versions 
• Migrated 60M users, without noticeable downtime 
• Migration allowed index changes, saves TBs of disk I/O 
• Block-level Index Compression, saves PBs of disk I/O 
• This is just the beginning.
Thank you 
christian@evernote.com
We’re hiring 
evernote.com/careers

Mais conteúdo relacionado

Mais procurados

Designing Payloads for Event-Driven Systems | Lorna Mitchell, Aiven
Designing Payloads for Event-Driven Systems | Lorna Mitchell, AivenDesigning Payloads for Event-Driven Systems | Lorna Mitchell, Aiven
Designing Payloads for Event-Driven Systems | Lorna Mitchell, AivenHostedbyConfluent
 
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksSearching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksLucidworks
 
Case study of Rujhaan.com (A social news app )
Case study of Rujhaan.com (A social news app )Case study of Rujhaan.com (A social news app )
Case study of Rujhaan.com (A social news app )Rahul Jain
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksDatabricks
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrRahul Jain
 
Akka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive PlatformAkka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive PlatformLegacy Typesafe (now Lightbend)
 
Real Time search using Spark and Elasticsearch
Real Time search using Spark and ElasticsearchReal Time search using Spark and Elasticsearch
Real Time search using Spark and ElasticsearchSigmoid
 
Twitter Search Architecture
Twitter Search Architecture Twitter Search Architecture
Twitter Search Architecture Ramez Al-Fayez
 
Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...
Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...
Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...Lucidworks
 
Spark DataFrames: Simple and Fast Analytics on Structured Data at Spark Summi...
Spark DataFrames: Simple and Fast Analytics on Structured Data at Spark Summi...Spark DataFrames: Simple and Fast Analytics on Structured Data at Spark Summi...
Spark DataFrames: Simple and Fast Analytics on Structured Data at Spark Summi...Databricks
 
Solr At Scale For Time-Oriented Data: Presented by Brett Hoerner, Rocana
Solr At Scale For Time-Oriented Data: Presented by Brett Hoerner, RocanaSolr At Scale For Time-Oriented Data: Presented by Brett Hoerner, Rocana
Solr At Scale For Time-Oriented Data: Presented by Brett Hoerner, RocanaLucidworks
 
Custom management apps for Kafka
Custom management apps for KafkaCustom management apps for Kafka
Custom management apps for KafkaSotaro Kimura
 
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...Lightbend
 
Efficient State Management With Spark 2.0 And Scale-Out Databases
Efficient State Management With Spark 2.0 And Scale-Out DatabasesEfficient State Management With Spark 2.0 And Scale-Out Databases
Efficient State Management With Spark 2.0 And Scale-Out DatabasesJen Aman
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftChester Chen
 
Automated Spark Deployment With Declarative Infrastructure
Automated Spark Deployment With Declarative InfrastructureAutomated Spark Deployment With Declarative Infrastructure
Automated Spark Deployment With Declarative InfrastructureSpark Summit
 
Revitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsRevitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsLightbend
 
Apache Spark and Online Analytics
Apache Spark and Online Analytics Apache Spark and Online Analytics
Apache Spark and Online Analytics Databricks
 

Mais procurados (20)

Designing Payloads for Event-Driven Systems | Lorna Mitchell, Aiven
Designing Payloads for Event-Driven Systems | Lorna Mitchell, AivenDesigning Payloads for Event-Driven Systems | Lorna Mitchell, Aiven
Designing Payloads for Event-Driven Systems | Lorna Mitchell, Aiven
 
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksSearching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
 
Case study of Rujhaan.com (A social news app )
Case study of Rujhaan.com (A social news app )Case study of Rujhaan.com (A social news app )
Case study of Rujhaan.com (A social news app )
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on Databricks
 
Kafka Connect
Kafka ConnectKafka Connect
Kafka Connect
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache Solr
 
Akka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive PlatformAkka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive Platform
 
Real Time search using Spark and Elasticsearch
Real Time search using Spark and ElasticsearchReal Time search using Spark and Elasticsearch
Real Time search using Spark and Elasticsearch
 
Twitter Search Architecture
Twitter Search Architecture Twitter Search Architecture
Twitter Search Architecture
 
KSQL Intro
KSQL IntroKSQL Intro
KSQL Intro
 
Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...
Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...
Loading 350M documents into a large Solr cluster: Presented by Dion Olsthoorn...
 
Spark DataFrames: Simple and Fast Analytics on Structured Data at Spark Summi...
Spark DataFrames: Simple and Fast Analytics on Structured Data at Spark Summi...Spark DataFrames: Simple and Fast Analytics on Structured Data at Spark Summi...
Spark DataFrames: Simple and Fast Analytics on Structured Data at Spark Summi...
 
Solr At Scale For Time-Oriented Data: Presented by Brett Hoerner, Rocana
Solr At Scale For Time-Oriented Data: Presented by Brett Hoerner, RocanaSolr At Scale For Time-Oriented Data: Presented by Brett Hoerner, Rocana
Solr At Scale For Time-Oriented Data: Presented by Brett Hoerner, Rocana
 
Custom management apps for Kafka
Custom management apps for KafkaCustom management apps for Kafka
Custom management apps for Kafka
 
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
 
Efficient State Management With Spark 2.0 And Scale-Out Databases
Efficient State Management With Spark 2.0 And Scale-Out DatabasesEfficient State Management With Spark 2.0 And Scale-Out Databases
Efficient State Management With Spark 2.0 And Scale-Out Databases
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
 
Automated Spark Deployment With Declarative Infrastructure
Automated Spark Deployment With Declarative InfrastructureAutomated Spark Deployment With Declarative Infrastructure
Automated Spark Deployment With Declarative Infrastructure
 
Revitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsRevitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive Streams
 
Apache Spark and Online Analytics
Apache Spark and Online Analytics Apache Spark and Online Analytics
Apache Spark and Online Analytics
 

Destaque

Architecture of a search engine
Architecture of a search engineArchitecture of a search engine
Architecture of a search engineSylvain Utard
 
This Ain't Your Parent's Search Engine: Presented by Grant Ingersoll, Lucidworks
This Ain't Your Parent's Search Engine: Presented by Grant Ingersoll, LucidworksThis Ain't Your Parent's Search Engine: Presented by Grant Ingersoll, Lucidworks
This Ain't Your Parent's Search Engine: Presented by Grant Ingersoll, LucidworksLucidworks
 
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...Lucidworks
 
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...Lucidworks
 
Lucene/Solr Spatial in 2015: Presented by David Smiley
Lucene/Solr Spatial in 2015: Presented by David SmileyLucene/Solr Spatial in 2015: Presented by David Smiley
Lucene/Solr Spatial in 2015: Presented by David SmileyLucidworks
 
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...Lucidworks
 
Lucene/Solr Revolution 2015 Opening Keynote with Lucidworks CEO Will Hayes
Lucene/Solr Revolution 2015 Opening Keynote with Lucidworks CEO Will HayesLucene/Solr Revolution 2015 Opening Keynote with Lucidworks CEO Will Hayes
Lucene/Solr Revolution 2015 Opening Keynote with Lucidworks CEO Will HayesLucidworks
 
Evolving Search Relevancy: Presented by James Strassburg, Direct Supply
Evolving Search Relevancy: Presented by James Strassburg, Direct SupplyEvolving Search Relevancy: Presented by James Strassburg, Direct Supply
Evolving Search Relevancy: Presented by James Strassburg, Direct SupplyLucidworks
 
Search at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, TwitterSearch at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, TwitterLucidworks
 
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.Lucidworks
 
Webinar: Ecommerce, Rules, and Relevance
Webinar: Ecommerce, Rules, and RelevanceWebinar: Ecommerce, Rules, and Relevance
Webinar: Ecommerce, Rules, and RelevanceLucidworks
 
Ektron 8.5 RC - Search
Ektron 8.5 RC - SearchEktron 8.5 RC - Search
Ektron 8.5 RC - SearchBillCavaUs
 
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...Lucidworks
 
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Lucidworks
 
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, LucidworksVisualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, LucidworksLucidworks
 
Enterprise Search Summit Keynote: A Big Data Architecture for Search
Enterprise Search Summit Keynote: A Big Data Architecture for SearchEnterprise Search Summit Keynote: A Big Data Architecture for Search
Enterprise Search Summit Keynote: A Big Data Architecture for SearchSearch Technologies
 
Basics of Search Engines and Algorithms
Basics of Search Engines and AlgorithmsBasics of Search Engines and Algorithms
Basics of Search Engines and AlgorithmsWeb Trainings Academy
 
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...Lucidworks
 
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Lucidworks
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Lucidworks
 

Destaque (20)

Architecture of a search engine
Architecture of a search engineArchitecture of a search engine
Architecture of a search engine
 
This Ain't Your Parent's Search Engine: Presented by Grant Ingersoll, Lucidworks
This Ain't Your Parent's Search Engine: Presented by Grant Ingersoll, LucidworksThis Ain't Your Parent's Search Engine: Presented by Grant Ingersoll, Lucidworks
This Ain't Your Parent's Search Engine: Presented by Grant Ingersoll, Lucidworks
 
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...
 
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile...
 
Lucene/Solr Spatial in 2015: Presented by David Smiley
Lucene/Solr Spatial in 2015: Presented by David SmileyLucene/Solr Spatial in 2015: Presented by David Smiley
Lucene/Solr Spatial in 2015: Presented by David Smiley
 
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge...
 
Lucene/Solr Revolution 2015 Opening Keynote with Lucidworks CEO Will Hayes
Lucene/Solr Revolution 2015 Opening Keynote with Lucidworks CEO Will HayesLucene/Solr Revolution 2015 Opening Keynote with Lucidworks CEO Will Hayes
Lucene/Solr Revolution 2015 Opening Keynote with Lucidworks CEO Will Hayes
 
Evolving Search Relevancy: Presented by James Strassburg, Direct Supply
Evolving Search Relevancy: Presented by James Strassburg, Direct SupplyEvolving Search Relevancy: Presented by James Strassburg, Direct Supply
Evolving Search Relevancy: Presented by James Strassburg, Direct Supply
 
Search at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, TwitterSearch at Twitter: Presented by Michael Busch, Twitter
Search at Twitter: Presented by Michael Busch, Twitter
 
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P.
 
Webinar: Ecommerce, Rules, and Relevance
Webinar: Ecommerce, Rules, and RelevanceWebinar: Ecommerce, Rules, and Relevance
Webinar: Ecommerce, Rules, and Relevance
 
Ektron 8.5 RC - Search
Ektron 8.5 RC - SearchEktron 8.5 RC - Search
Ektron 8.5 RC - Search
 
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr...
 
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbul...
 
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, LucidworksVisualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
Visualize Solr Data with Banana: Presented by Andrew Thanalertvisuti, Lucidworks
 
Enterprise Search Summit Keynote: A Big Data Architecture for Search
Enterprise Search Summit Keynote: A Big Data Architecture for SearchEnterprise Search Summit Keynote: A Big Data Architecture for Search
Enterprise Search Summit Keynote: A Big Data Architecture for Search
 
Basics of Search Engines and Algorithms
Basics of Search Engines and AlgorithmsBasics of Search Engines and Algorithms
Basics of Search Engines and Algorithms
 
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
 
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
 

Semelhante a Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote

Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scalejimriecken
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaAngelo Cesaro
 
kafka simplicity and complexity
kafka simplicity and complexitykafka simplicity and complexity
kafka simplicity and complexityPaolo Platter
 
Hands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationHands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationAmir Hossein Sorouri
 
Apache kafka- Onkar Kadam
Apache kafka- Onkar KadamApache kafka- Onkar Kadam
Apache kafka- Onkar KadamOnkar Kadam
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansPeter Clapham
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpJosé Román Martín Gil
 
Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale Perforce
 
Lucene Bootcamp - 2
Lucene Bootcamp - 2Lucene Bootcamp - 2
Lucene Bootcamp - 2GokulD
 
Improved Search With Lucene 4.0 - NOVA Lucene/Solr Meetup
Improved Search With Lucene 4.0 - NOVA Lucene/Solr MeetupImproved Search With Lucene 4.0 - NOVA Lucene/Solr Meetup
Improved Search With Lucene 4.0 - NOVA Lucene/Solr Meetuprcmuir
 
Case Study - How Rackspace Query Terabytes Of Data
Case Study - How Rackspace Query Terabytes Of DataCase Study - How Rackspace Query Terabytes Of Data
Case Study - How Rackspace Query Terabytes Of DataSchubert Zhang
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache KuduAndriy Zabavskyy
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageMayaData Inc
 
SOA with PHP and Symfony
SOA with PHP and SymfonySOA with PHP and Symfony
SOA with PHP and SymfonyMichalSchroeder
 

Semelhante a Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote (20)

Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scale
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 
kafka simplicity and complexity
kafka simplicity and complexitykafka simplicity and complexity
kafka simplicity and complexity
 
Flexible Indexing in Lucene 4.0
Flexible Indexing in Lucene 4.0Flexible Indexing in Lucene 4.0
Flexible Indexing in Lucene 4.0
 
Hands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationHands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestration
 
Apache kafka- Onkar Kadam
Apache kafka- Onkar KadamApache kafka- Onkar Kadam
Apache kafka- Onkar Kadam
 
stigbot_beta
stigbot_betastigbot_beta
stigbot_beta
 
Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticians
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
 
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUpStrimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp
 
Solr 4
Solr 4Solr 4
Solr 4
 
Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale Still All on One Server: Perforce at Scale
Still All on One Server: Perforce at Scale
 
Lucene Bootcamp - 2
Lucene Bootcamp - 2Lucene Bootcamp - 2
Lucene Bootcamp - 2
 
Improved Search With Lucene 4.0 - NOVA Lucene/Solr Meetup
Improved Search With Lucene 4.0 - NOVA Lucene/Solr MeetupImproved Search With Lucene 4.0 - NOVA Lucene/Solr Meetup
Improved Search With Lucene 4.0 - NOVA Lucene/Solr Meetup
 
Case Study - How Rackspace Query Terabytes Of Data
Case Study - How Rackspace Query Terabytes Of DataCase Study - How Rackspace Query Terabytes Of Data
Case Study - How Rackspace Query Terabytes Of Data
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
 
Serverless design with Fn project
Serverless design with Fn projectServerless design with Fn project
Serverless design with Fn project
 
SOA with PHP and Symfony
SOA with PHP and SymfonySOA with PHP and Symfony
SOA with PHP and Symfony
 
Kafka overview v0.1
Kafka overview v0.1Kafka overview v0.1
Kafka overview v0.1
 

Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote

  • 1. Search Architecture at Evernote: Not Your Typical Big Data Problem. Christian Kohlschütter, Sr. Search Researcher, Augmented Intelligence @ Evernote
  • 2. We are the workspace.
  • 13. Serving 100+ Million Users Worldwide
      • 559 shards (200k users per shard), Linux/Tomcat/MySQL
      • 3.2 PB WebDAV-based storage
      • 224 TB SSD capacity for system, MySQL and Lucene
      • 3.1 billion notes stored, 3.8 billion notes ever created
      • 115 million notes created or edited last week
      • 26 million API calls to Context last week
      • 1 Lucene index per user
  • 14. Evernote’s Three Laws of Data Protection
      • Your Data is Yours
      • Your Data is Protected
      • Your Data is Portable
      We are not a “big data” company and do not try to make money from your content.
  • 15. Technical Debt
      • I/O over Lucene 2.9 indexes became a bottleneck
      • Code was woven into our “NoteStore” platform
      • Index changes had to be backwards-compatible
      • Complex re-indexing would require taking down a shard
      • Needed to rethink the entire architecture, but keep the public API
      • Make search faster vs. make us move faster
  • 16. From Lucene 2.9 to 4.x and beyond
      • Large refactoring of search code
      • Lucene is no longer a direct dependency in “NoteStore”
      • Design-by-Contract
      • Can now run multiple Lucene versions concurrently in one VM
      • … and one specific version / schema per user
      • Migrated all users to Lucene 4.5, avg. downtime/user < 1 min
  • 17. Separate the What from the How
  • 18. Separation of Concerns
      [Diagram with an API layer and an Implementation layer. Components shown: NoteStore, UserIndexManager, UserIndexFactory, UserIndex, Lucene29UserIndexImpl, Lucene4UserIndexImpl, CachingUserIndex, BenchmarkingUserIndex, ...]
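The layering named on this slide can be illustrated with a small decorator sketch. Only the type names UserIndex and CachingUserIndex come from the slide; the `search` method, its signature, and the `FakeLucene4UserIndex` stand-in are assumptions made for illustration, not Evernote's actual API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** The contract the caller codes against; the method shown is hypothetical. */
interface UserIndex {
    List<String> search(String query);
}

/** Hypothetical stand-in for a version-specific class such as Lucene4UserIndexImpl. */
final class FakeLucene4UserIndex implements UserIndex {
    int calls = 0; // counts how often the underlying index is actually queried

    @Override
    public List<String> search(String query) {
        calls++;
        return Collections.singletonList("note-for-" + query);
    }
}

/** Decorator that adds caching; neither the caller nor the delegate changes. */
final class CachingUserIndex implements UserIndex {
    private final UserIndex delegate;
    private final Map<String, List<String>> cache = new HashMap<>();

    CachingUserIndex(UserIndex delegate) {
        this.delegate = delegate;
    }

    @Override
    public List<String> search(String query) {
        // Only the first lookup for a given query reaches the delegate.
        return cache.computeIfAbsent(query, delegate::search);
    }
}
```

Because the caller only sees `UserIndex`, a factory could return the plain implementation, a cached one, or a benchmarking wrapper without any change on the calling side.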
  • 19. Hide Lucene behind ClassLoaders
      • One Maven artifact per major Lucene version, build profiles for code reuse between minor updates
      • Code is packaged with dependencies into one common fat-jar with prefixes for each implementation:
          - lucene29/org/apache/lucene/... and lucene29/com/evernote/search/lucene2/…
          - lucene43/org/apache/lucene/... and lucene43/com/evernote/search/lucene4/…
          - lucene45/org/apache/lucene/… and lucene45/com/evernote/search/lucene4/…
      • ResourcePrefixClassLoader, called from outside code, strips the prefix and uses the fat-jar as the only dependency
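A minimal sketch of the prefix idea, assuming class files sit under a per-version directory such as `lucene45/` on the parent loader's classpath. The class name `PrefixedClassLoader` and its methods are hypothetical; the slides do not show how ResourcePrefixClassLoader is actually implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Objects;

/** Hypothetical loader that resolves classes from a prefixed location. */
final class PrefixedClassLoader extends ClassLoader {
    private final String prefix; // e.g. "lucene45/"

    PrefixedClassLoader(String prefix, ClassLoader parent) {
        super(Objects.requireNonNull(parent));
        this.prefix = prefix;
    }

    /** Maps a class name to its prefixed resource path inside the fat-jar. */
    static String resourcePath(String prefix, String className) {
        return prefix + className.replace('.', '/') + ".class";
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Reached only after the parent failed to load the class under its
        // plain name, which is expected because it is stored under the prefix.
        String path = resourcePath(prefix, name);
        try (InputStream in = getParent().getResourceAsStream(path)) {
            if (in == null) {
                throw new ClassNotFoundException(name + " (looked for " + path + ")");
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            byte[] bytes = out.toByteArray();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

Two such loaders with different prefixes can define classes with the same fully qualified name side by side, which is what would let several Lucene versions coexist in one VM.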
  • 20. New Index Structure
      • Each user’s index now comes with a properties file that describes its internal structures, such as index type and version, so code can handle different behavior.
      • Changes to the index schema? Just increase the index version and handle the rest in code
      • Automatically trigger re-indexing if necessary
  • 21. Index Auto-Migration
      • Target default index implementation centrally set by DevOps
      • Triggered upon UserIndex access
      • UserIndex facade determines whether a re-index is necessary
      • “Cruise Control” automates off-peak access
      [Chart: # Threads]
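Slides 20 and 21 together describe a per-index properties file and a check on access that decides whether to re-index. The sketch below shows one way such a check could look. The property keys `index.type` and `index.version`, the class name `IndexDescriptor`, and the comparison rule are all assumptions, since the slides do not give the file format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical view of the per-user properties file. */
final class IndexDescriptor {
    final String type;  // e.g. "lucene29" or "lucene45" (assumed naming)
    final int version;  // schema version, increased on schema changes

    IndexDescriptor(String type, int version) {
        this.type = type;
        this.version = version;
    }

    /** Parses the properties text; the key names are assumed for illustration. */
    static IndexDescriptor parse(String propertiesText) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new IndexDescriptor(
                p.getProperty("index.type", "unknown"),
                Integer.parseInt(p.getProperty("index.version", "0")));
    }

    /** True if this index lags behind the centrally configured target. */
    boolean needsReindex(IndexDescriptor target) {
        return !type.equals(target.type) || version < target.version;
    }
}
```

A facade could run `needsReindex` each time a user's index is opened and schedule the rebuild only when it returns true, so untouched accounts cost nothing until they are accessed.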
  • 22. Phase 1: Migration to Lucene 4
      • Changes in disk I/O (CPU correlates):
          - overall: -81%
          - searchRelatedNotes: -87%
          - keyword-based search: -96%
      Saves TBs of I/O
  • 23. Phase 2: Add Compression
      • User index sizes and access patterns are skewed
      • Optimize large accounts
      • Directory-level compression
      • Compress segment files, invisible to the IndexReader
      • Only when re-indexing / every 3 months
      • In-memory caching
  • 24. LuceneTransform
      • https://code.google.com/p/lucenetransform by Mitja Lenič
      • We ported it to Lucene 4.5 (now available upstream for 4.9)
      • Improved LRU caching, added LZ4/Snappy compression
      • We will contribute our changes soon
  • 25. OverlayDirectory
      • on disk: _23.cfe, _23.si, c$_23.cfs, segments.gen, segments_2
      • visible to IndexReader: _23.cfe, _23.si, _23.cfs, segments.gen, segments_2
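The file listing on this slide suggests that compressed files carry a `c$` prefix on disk while the reader sees the plain name. A minimal sketch of that name mapping follows. The class and method names are hypothetical, and the real OverlayDirectory also has to decompress file contents, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for the on-disk to visible file name mapping. */
final class OverlayNames {
    static final String COMPRESSED_PREFIX = "c$";

    /** True if the on-disk name marks a compressed file. */
    static boolean isCompressed(String onDiskName) {
        return onDiskName.startsWith(COMPRESSED_PREFIX);
    }

    /** Name under which the file is presented to the index reader. */
    static String visibleName(String onDiskName) {
        return isCompressed(onDiskName)
                ? onDiskName.substring(COMPRESSED_PREFIX.length())
                : onDiskName;
    }

    /** Maps a whole directory listing, preserving order. */
    static List<String> listVisible(List<String> onDiskNames) {
        List<String> visible = new ArrayList<>(onDiskNames.size());
        for (String name : onDiskNames) {
            visible.add(visibleName(name));
        }
        return visible;
    }
}
```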
  • 26. Results
      • Compressed the largest 5% of all indexes using LZ4
      • 1.9 TB index space saved
      • 100 MB LRU cache hit rate: 79% on average (67% to 93%)
      • Saved 0.5 PB disk reads/week
      • Cache is so good that we may use a better/slower compression algorithm and may apply it to more users
      Saves PBs of I/O
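For readers unfamiliar with LRU caching, the sketch below shows the general technique using `LinkedHashMap` in access order, plus a hit-rate counter like the one reported above. It is bounded by entry count rather than by bytes (the slide's cache is 100 MB), and the class name `LruBlockCache` is hypothetical; this is not the LuceneTransform implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical entry-count-bounded LRU cache with hit-rate tracking. */
final class LruBlockCache<K, V> {
    private final Map<K, V> map;
    private long hits;
    private long misses;

    LruBlockCache(final int maxEntries) {
        // accessOrder=true moves an entry to the end on every get/put,
        // so the eldest entry is always the least recently used one.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value or null, counting the lookup as hit or miss. */
    V get(K key) {
        V value = map.get(key);
        if (value == null) {
            misses++;
        } else {
            hits++;
        }
        return value;
    }

    void put(K key, V value) {
        map.put(key, value);
    }

    /** Fraction of lookups served from the cache, 0.0 if there were none. */
    double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

A byte-bounded variant would track the size of each cached block and evict until the total drops below the limit, instead of relying on `removeEldestEntry` alone.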
  • 27. Bugs, Bugs, Bugs :-)
      • We’ve been warned:
          - “VInt bug”
          - “background merge hit exception”
          - JVM segfaults!
      • And then this happened, too:
          - SPI / ContextClassLoaders … LUCENE-4713
          - Deadlocks / over-optimistic locking
          - Unclosed resources / too many open file handles => HousekeepingDirectory
          - Issues with FieldCache singleton => LUCENE-831, LUCENE-2133, …
          - …
      • UserIndex tracks “broken” state; allows self-healing (rebuild)
  • 28. Conclusion
      • Design-By-Contract, Separation of Concerns
      • Per-user search implementation / multiple Lucene versions
      • Migrated 60M users without noticeable downtime
      • Migration allowed index changes, saves TBs of disk I/O
      • Block-level index compression, saves PBs of disk I/O
      • This is just the beginning.