1/37
ElasticSearch
feedback
2/37
Introduction
3/37
Nicolas Blanc - BlaBlArchitect
Sinfomic
(1999)
@thewhitegeek
(2001)
(2005)
(2008)
(2012)
4/37
What is BlaBlaCar ?
5/37
3,000,000 members in Europe
6/37
9 → 10 countries
● France
● Spain
● Italy
● UK
● Poland
● Portugal
● Netherlands
● Belgium
● Luxembourg
● NEW Germany
7/37
Growth
[Chart: growth from January 2008 to January 2013, with 25 million and 50 million marked on the scale]
8/37
Infrastructure
● 2 front web servers
● 2 MySQL masters (+4 SSD-backed slaves)
● 1 private cloud (KVM + Open vSwitch)
  ● Redis
  ● Memcache
  ● RabbitMQ/workers
● 1 ElasticSearch cluster
9/37
Changing the Search Engine
10/37
What exists today? Why change?
MySQL database
● Relational DB (lots of joins needed)
● Plain SQL queries
● Home-made geographical search
Recent problems
● New features mean more complex queries
● Scalability: performance depends on DB load
11/37
Initial requirements
Scalability
● Trip searches must complete in less than 200 ms
● The system side of the solution must be easy to maintain
● It must be clusterable (also to avoid a single point of failure)
Low code impact on the existing application
● Same features as today (geographical search)
● Minimize the developers' work
● Add one missing feature: facets
12/37
Initial Competitors
SenseiDB
13/37
Why ElasticSearch
✔ Easiest clustering
✔ Good indexing performance
✔ Little code to write to use it
✔ Schemaless
✔ Based on Lucene
✔ Written in Java (we needed to code a grouping feature)
14/37
ElasticSearch won:
now let's migrate our search!
15/37
Changing our mindset
Object in a relational database
● Can be split across multiple tables
● Lots of information reachable through JOINs
Object in a document-oriented database
● Only one big index for these objects
● All information needs to be in the object itself, not spread over multiple tables
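A minimal sketch of that shift, in Python: rows that would sit in separate relational tables are folded into one self-contained document before indexing. The row layouts, the sample values, and the `build_trip_document` helper are hypothetical and only illustrate the idea; the output field names are borrowed from the Trip mapping shown later in this deck.

```python
# Hypothetical rows, as they might come back from three separate SQL tables.
trip_row = {"id": 1234567, "user_id": 42, "price": 25.0, "seats_left": 2}
user_row = {"id": 42, "ratings": 5, "is_blocked": False}
vehicle_row = {"user_id": 42, "model": "Clio"}


def build_trip_document(trip, user, vehicle):
    """Fold related rows into one flat document, so no JOIN is needed at search time."""
    return {
        "userid": str(trip["user_id"]),
        "price": trip["price"],
        "seats_left": trip["seats_left"],
        "user_ratings": user["ratings"],
        "is_user_blocked": user["is_blocked"],
        "vehicle": vehicle["model"],
    }


document = build_trip_document(trip_row, user_row, vehicle_row)
```

The price of this layout is that a change in any source table (for example, a member being blocked) means the whole document has to be rebuilt and reindexed.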
17/37
Defining our objects well
We need to know what we want to search
● Trips (front-office usage)
● Members (back-office usage)
● FAQ (front-office usage)
Think of all the fields needed
● Those used for queries
● Those used for filters
● Those used for facets
18/37
Defining the index well
System point of view
● Number of nodes in the cluster
● Number of shards
● Number of replicas
Application point of view
● Define the type and attributes of every field (mapping)
● Use parent/child or nested documents to improve indexing
● How to push documents from the DB?
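The system-side choices can be written down as the body of an index-creation request. The Python sketch below only builds that body. `number_of_shards` and `number_of_replicas` are standard ElasticSearch index settings; the helper name and the example values are assumptions made for illustration, not the deck's actual configuration.

```python
import json


def index_creation_body(shards, replicas, mappings):
    """Build the JSON body for creating an index with explicit settings."""
    if shards < 1 or replicas < 0:
        raise ValueError("need at least 1 shard and a non-negative replica count")
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": mappings,
    }


# Example values only: 1 shard, 1 replica, and a one-field mapping.
body = index_creation_body(
    shards=1,
    replicas=1,
    mappings={"trip": {"properties": {"price": {"type": "float"}}}},
)
print(json.dumps(body, indent=2))
```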
19/37
Indexing: using a river or not?
River advantages
● Plugs directly into our source backend
● An ElasticSearch API exists for coding a new one
River problems
● Not easy to add business logic on some fields
● Really hard when your DB is unconventional
● Requires a full reindex of all the documents
20/37
Indexing: our manual way
We wrote an asynchronous indexer
● Written in Java
● Applies business logic when fetching from the DB
● Fetches from multiple DBs/sources
● Uses the Java ES library
● Easy interface
  ● Send {"trip":1234567} and the server answers {"OK"}
21/37
One index sample : Trip
22/37
Defining our Trip object well
Think of all the fields needed
● Those used for queries
  ● Trip departure date, from where, to where, user id
● Those used for filters
  ● User ratings, price, vehicle, seats left, is user blocked (a blocked user is one who performed a forbidden action on the website)
● Those used for facets
  ● User ratings, price, vehicle
23/37
Defining our Trip index well
Think of all the system requirements
● The cluster has 2 nodes
● We keep the default configuration for shards/replicas
Think of the object mapping
● For each field:
  ● Define the type (string, long, geo_point, date, float, boolean)
  ● Define the scope (include_in_all)
  ● Define the analyzer (for type string)
24/37
Trip Mapping
"trip": {
  "properties": {
    "is_user_blocked": {
      "type": "boolean",
      "include_in_all": false
    },
    "user_ratings": {
      "type": "long",
      "include_in_all": false
    },
    "from": {
      "type": "geo_point",
      "include_in_all": false
    },
    "price": {
      "include_in_all": false,
      "type": "float"
    },
    "price_euro": {
      "type": "float",
      "include_in_all": false
    },
    "seats_left": {
      "include_in_all": false,
      "type": "long"
    },
    "seats_offered": {
      "include_in_all": false,
      "type": "long"
    },
    "to": {
      "include_in_all": false,
      "type": "geo_point"
    },
    "trip_date": {
      "format": "dateOptionalTime",
      "include_in_all": false,
      "type": "date"
    },
    "vehicle": {
      "include_in_all": false,
      "type": "string"
    },
    "userid": {
      "include_in_all": false,
      "index": "not_analyzed",
      "type": "string"
    }
  }
}
25/37
Indexing events well
Which modifications send a change event
● Every trip creation/deletion/modification
● Member modifications (blocked or not)
● New ratings from other members
● A seat has been reserved
● A member changes their vehicle
A change event is a call to the internal indexer
● Send '{"trip":123456}' to the indexer (create/update)
● Send '{"tripd":123456}' to the indexer (delete)
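The deck does not show the indexer's code (the real one is written in Java). As a rough Python sketch of how the two messages above could be told apart, assuming they arrive as plain JSON objects with a single key, the hypothetical `parse_event` helper below maps the `trip` key to a create/update and the `tripd` key to a delete.

```python
import json


def parse_event(message):
    """Turn an indexer message into an (action, trip_id) pair.

    '{"trip": id}'  -> ("index", id)   create or update
    '{"tripd": id}' -> ("delete", id)
    """
    payload = json.loads(message)
    if len(payload) != 1:
        raise ValueError("expected exactly one key")
    key, trip_id = next(iter(payload.items()))
    actions = {"trip": "index", "tripd": "delete"}
    if key not in actions:
        raise ValueError("unknown event key: %s" % key)
    return actions[key], int(trip_id)


create_or_update = parse_event('{"trip":123456}')
delete = parse_event('{"tripd":123456}')
```

Because the message carries only an id, the sender never needs to know how the document is built: the indexer fetches and assembles it from the databases on its own.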
26/37
Sample trip index query
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "and": [{
          "geo_distance": {
            "distance": "40.14937866995km",
            "from": {
              "lat": 48.856614,
              "lon": 2.3522219
            }
          }
        }, {
          "geo_distance": {
            "distance": "40.14937866995km",
            "to": {
              "lat": 45.764043,
              "lon": 4.835659
            }
          }
        }, {
          "range": {
            "price": {
              "from": 0,
              "include_lower": false
            }
          }
        }]
      }
    }
  },
  "sort": [{
    "trip_date": { "order": "asc" }
  }],
  "filter": {
    "term": { "is_user_blocked": false }
  },
  "from": 0,
  "size": 10
}
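In application code a query like this would normally be generated rather than written by hand. The Python sketch below builds a body with the same shape as the sample; the `trip_search_body` helper and its parameters are hypothetical. The `filtered` query and the `and` filter belong to the ElasticSearch releases of that era and were replaced in later versions, so treat this as an illustration of the structure rather than a current API.

```python
def trip_search_body(origin, destination, radius_km, offset=0, size=10):
    """Build a search body shaped like the sample query.

    origin and destination are (lat, lon) tuples.
    """
    def near(field, point):
        # One geo_distance filter on the given geo_point field.
        lat, lon = point
        return {
            "geo_distance": {
                "distance": "%skm" % radius_km,
                field: {"lat": lat, "lon": lon},
            }
        }

    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "and": [
                        near("from", origin),
                        near("to", destination),
                        {"range": {"price": {"from": 0, "include_lower": False}}},
                    ]
                },
            }
        },
        "sort": [{"trip_date": {"order": "asc"}}],
        "filter": {"term": {"is_user_blocked": False}},
        "from": offset,
        "size": size,
    }


# The coordinates used in the sample query: Paris to Lyon.
paris = (48.856614, 2.3522219)
lyon = (45.764043, 4.835659)
body = trip_search_body(paris, lyon, radius_km=40)
```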
27/37
The Real World
A trip now has more than 30 fields
● (FAQ has around 25 fields)
● (members have even more...)
Building a trip document takes 3 different SQL queries
● (FAQ: 2 different SQL queries)
● (Member: 10 different SQL queries)
The Trip index has only 1 shard (because of grouping)
28/37
And now the caveats
29/37
Preloaded scripts
We use MVEL scripts to improve scoring
● They are not distributed across the cluster
● Each node needs its own copy of the scripts
● A node restart is needed to add or modify them
Solution: Chef (a tool from Opscode)
All node configurations are centralized in the Chef repository
30/37
Grouping documents
Home-made patches to ElasticSearch
(based on work by Martijn van Groningen for lusini.de)
Coming soon in ElasticSearch
(I really hope so)
31/37
Mapping modification
On a running index:
Changing a field's type is not allowed
Changing its analyzer is not allowed
Solution: index alias
1) Mapping change → create a new index
2) When the new index is up to date → switch the alias
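Step 2 can be done in a single call to ElasticSearch's `_aliases` endpoint, which accepts a list of `remove` and `add` actions and applies them together. The Python sketch below only builds that request body; the helper name and the index names `trip_v1` and `trip_v2` are hypothetical.

```python
def alias_swap_actions(alias, old_index, new_index):
    """Build the body for the _aliases endpoint: move an alias from old to new.

    Both actions travel in one request, so searches that go through the
    alias do not hit a moment where it points at no index.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = alias_swap_actions("trip", "trip_v1", "trip_v2")
```

The application always searches the alias (`trip` here), so switching to the new mapping needs no application change.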
32/37
I/O limits
We have only 2 nodes
● The Trip index is around 2 GB
● But the Trip index has only 1 shard
● Can index 100 trips per second on a busy evening
Solution: we installed Intel SSDs
(while waiting for the distributed grouping feature)
33/37
Choosing the analyzer
Some fields must not be analyzed
● For example, ISO country codes (IT for Italy or DE for Germany are ignored in some cases)
A global analyzer has limits
● Accented characters from countries such as France, Germany, or Spain are not always parsed correctly
● One analyzer per country is difficult to implement in some cases
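The deck does not say why the country codes get ignored. One plausible cause, assumed here, is stop-word removal: once lowercased, "IT" and "DE" look like the English word "it" and the French word "de". The Python sketch below is a deliberately crude stand-in for an analyzer, with tiny hand-picked stop-word lists, and shows why leaving such a field unanalyzed (as `userid` is with `"index": "not_analyzed"` in the Trip mapping) avoids the problem.

```python
# Tiny illustrative stop-word lists; real analyzers ship much longer ones.
STOP_WORDS = {
    "english": {"it", "is", "the", "to", "of"},
    "french": {"de", "le", "la", "et"},
}


def analyze(text, language):
    """Rough stand-in for an analyzer: lowercase, split, drop stop words."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in STOP_WORDS[language]]


def keep_as_is(text):
    """Stand-in for a not_analyzed field: the value stays one exact token."""
    return [text]


analyzed_it = analyze("IT", "english")   # the code vanishes
analyzed_de = analyze("DE", "french")    # so does this one
exact_it = keep_as_is("IT")              # kept verbatim
```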
34/37
OK, sweet.
What's next?
35/37
Using ElasticSearch to ease log analysis
36/37
By the way…
We're hiring!
Devs, HTML ninjas, leaders, …
Come and see me right now
… or send me your friends
(And we have beer, foosball, and an arcade cabinet)
37/37
Thank you !
Follow us !
@covoiturage
Apply now :
join@BlaBlaCar.com

BlaBlaCar Elastic Search Feedback

  • 3. 3/37 Nicolas Blanc - BlaBlArchitect SinfomicSinfomic (1999) @thewhitegeek (2001) (2005) (2008) (2012)
  • 6. 10 countries (previously 9)
      ● France
      ● Spain
      ● Italy
      ● UK
      ● Poland
      ● Portugal
      ● Netherlands
      ● Belgium
      ● Luxembourg
      ● NEW: Germany
  • 8. Infrastructure
      ● 2 front web servers
      ● 2 MySQL masters (+4 slaves on SSD)
      ● 1 private cloud (KVM + Open vSwitch) running Redis, Memcache and RabbitMQ/workers
      ● 1 ElasticSearch cluster
  • 10. What exists? Why change?
      MySQL database
      ● Relational DB (lots of joins needed)
      ● Plain SQL queries
      ● Home-made geographical search
      Recent problems
      ● New features mean more complex queries
      ● Scalability: performance depends on DB load
  • 11. Initial requirements
      Scalability
      ● A trip search must complete in less than 200 ms
      ● The system side of the solution must be easy to maintain
      ● It must be clusterable (also to avoid a single point of failure)
      Low code impact on the existing application
      ● Same features as today (geographical search)
      ● Minimize the developers' work
      ● Add one missing feature: facets
  • 13. Why ElasticSearch
      ✔ Easiest clustering
      ✔ Good indexing performance
      ✔ Little code to write to use it
      ✔ Schemaless
      ✔ Based on Lucene
      ✔ Written in Java (we needed to code a grouping feature)
  • 14. ElasticSearch won; now to migrate our search!
  • 15. Changing our mindset
      Object in a relational database
      ● Can be spread across multiple tables
      ● Lots of information reachable through JOINs
      Object in a document-oriented database
      ● Only one big index for these objects
      ● All the information needs to be in the object, not in multiple tables
  • 17. Defining our objects well
      Know what we want to search
      ● Searching trips (front office usage)
      ● Searching members (back office usage)
      ● Searching the FAQ (front office usage)
      Think of all the fields needed
      ● The ones used for queries
      ● The ones used for filters
      ● The ones used for facets
  • 18. Defining the index well
      System point of view
      ● Number of nodes in the cluster
      ● Number of shards
      ● Number of replicas
      Application point of view
      ● Define the type and attributes of every field (mapping)
      ● Use parent/child or nested documents to improve indexing
      ● How do we push documents from the DB?
  • 19. Indexing: using a river or not?
      River advantages
      ● Plugs directly into our source backend
      ● An ElasticSearch API exists for coding a new one
      River problems
      ● Not easy to add business logic on some fields
      ● Really hard when your DB is unconventional
      ● Full reindex of all the documents
  • 20. Indexing: our manual way
      We wrote an asynchronous indexer
      ● Written in Java
      ● Applies business logic when fetching from the DB
      ● Fetches from multiple DBs/sources
      ● Uses the Java ES library
      ● Easy interface
      ● Send {"trip":1234567} and the server answers {"OK"}
  • 22. Defining our Trip object well
      Think of all the fields needed
      ● The ones used for queries
          ● Trip departure date, from where, to where, user id
      ● The ones used for filters
          ● User ratings, price, vehicle, seats left, is user blocked (a blocked user is a user who performed a forbidden action on the website)
      ● The ones used for facets
          ● User ratings, price, vehicle
  • 23. Defining our Trip index well
      Think of all the system requirements
      ● The cluster has 2 nodes
      ● We keep the default configuration for shards/replicas
      Think of the object mapping
      ● For each field:
          ● Define the type (string, long, geo_point, date, float, boolean)
          ● Define the scope (include_in_all)
          ● Define the analyzer (for type string)
  • 24. Trip Mapping
      "trip": {
        "properties": {
          "is_user_blocked": { "type": "boolean", "include_in_all": false },
          "user_ratings": { "type": "long", "include_in_all": false },
          "from": { "type": "geo_point", "include_in_all": false },
          "price": { "type": "float", "include_in_all": false },
          "price_euro": { "type": "float", "include_in_all": false },
          "seats_left": { "type": "long", "include_in_all": false },
          "seats_offered": { "type": "long", "include_in_all": false },
          "to": { "type": "geo_point", "include_in_all": false },
          "trip_date": { "type": "date", "format": "dateOptionalTime", "include_in_all": false },
          "vehicle": { "type": "string", "include_in_all": false },
          "userid": { "type": "string", "index": "not_analyzed", "include_in_all": false }
        }
      }
  • 25. Indexing events well
      Which modifications send a change event
      ● Every trip creation/deletion/modification
      ● Member modifications (blocked or not)
      ● New ratings from other members
      ● A seat has been reserved
      ● A member changes their vehicle
      A change event is a call to the internal indexer
      ● Send '{"trip":123456}' to the indexer (create/update)
      ● Send '{"tripd":123456}' to the indexer (delete)
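The two messages on slide 25 suggest a very small protocol: one key naming the action, one value carrying the trip id. A minimal Python sketch of how such a message could be dispatched follows. The function name `parse_event` and the action labels `"index"` and `"delete"` are assumptions made for illustration; the real indexer is written in Java and its internals are not shown in the slides.

```python
import json

# Hypothetical mapping from message key to indexer action (labels are assumptions).
ACTIONS = {"trip": "index", "tripd": "delete"}


def parse_event(message):
    """Turn a message such as '{"trip":123456}' into an (action, trip_id) pair."""
    payload = json.loads(message)
    if not isinstance(payload, dict) or len(payload) != 1:
        raise ValueError("expected a JSON object with exactly one key")
    key, trip_id = next(iter(payload.items()))
    if key not in ACTIONS:
        raise ValueError("unknown event key: %r" % key)
    return ACTIONS[key], int(trip_id)


if __name__ == "__main__":
    print(parse_event('{"trip":123456}'))
    print(parse_event('{"tripd":123456}'))
```

Keeping the message down to an id means the caller never builds the document itself; the indexer can refetch everything from the databases and apply the business logic in one place.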
  • 26. Sample trip index query
      {
        "query": {
          "filtered": {
            "query": { "match_all": {} },
            "filter": {
              "and": [
                { "geo_distance": { "distance": "40.14937866995km", "from": { "lat": 48.856614, "lon": 2.3522219 } } },
                { "geo_distance": { "distance": "40.14937866995km", "to": { "lat": 45.764043, "lon": 4.835659 } } },
                { "range": { "price": { "from": 0, "include_lower": false } } }
              ]
            }
          }
        },
        "sort": [{ "trip_date": { "order": "asc" } }],
        "filter": { "term": { "is_user_blocked": false } },
        "from": 0,
        "size": 10
      }
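The sample query has a regular shape: two `geo_distance` filters (one on `from`, one on `to`), a price range, a sort on `trip_date` and a top-level filter on `is_user_blocked`. As a rough sketch, the same body can be assembled from parameters in Python. The helper names `geo_filter` and `build_trip_search` are hypothetical, and the query syntax (`filtered` query, `and` filter) is the one used on the slide, which belongs to the ElasticSearch versions of that period.

```python
import json


def geo_filter(field, lat, lon, radius_km):
    """Build a geo_distance filter on a geo_point field ("from" or "to")."""
    return {
        "geo_distance": {
            "distance": "%skm" % radius_km,
            field: {"lat": lat, "lon": lon},
        }
    }


def build_trip_search(origin, destination, radius_km, offset=0, size=10):
    """Assemble a search body shaped like the sample query on slide 26."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "and": [
                        geo_filter("from", origin[0], origin[1], radius_km),
                        geo_filter("to", destination[0], destination[1], radius_km),
                        {"range": {"price": {"from": 0, "include_lower": False}}},
                    ]
                },
            }
        },
        "sort": [{"trip_date": {"order": "asc"}}],
        "filter": {"term": {"is_user_blocked": False}},
        "from": offset,
        "size": size,
    }


if __name__ == "__main__":
    # Coordinates taken from the slide.
    body = build_trip_search((48.856614, 2.3522219), (45.764043, 4.835659), 40.14937866995)
    print(json.dumps(body, indent=2))
```

Building the body as a plain dictionary and serializing it once avoids the hand-edited brace and comma mistakes that are easy to make in a query this deeply nested.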
  • 27. The Real World
      A trip now has more than 30 fields
      ● (FAQ: around 25 fields)
      ● (Members: even more...)
      Building a trip document takes 3 different SQL queries
      ● (FAQ: 2 different SQL queries)
      ● (Member: 10 different SQL queries)
      The trip index has only 1 shard (grouping)
  • 28. And now the caveats
  • 29. Preloaded Scripts
      We use MVEL scripts to improve scoring
      ● They are not clustered
      ● Each node needs to have the scripts
      ● A node restart is needed to add or modify them
      Solution: Chef (tool from Opscode)
      All node configurations are centralized in a Chef repository
  • 30. Grouping documents
      Home-made patches to ElasticSearch (based on work by Martijn van Groningen for lusini.de)
      Coming soon to ElasticSearch (I very much hope so)
  • 31. Mapping modification
      On a running index:
      ● Changing a type is not allowed
      ● Changing an analyzer is not allowed
      Solution: index alias
      1) To change the mapping, create a new index
      2) When the new index is up to date, switch the alias
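Step 2 on slide 31 can be done in a single request, because the ElasticSearch `_aliases` endpoint accepts a list of `remove` and `add` actions. The Python sketch below only builds that request body. The helper name `alias_swap_actions` and the index and alias names (`trips`, `trips_v1`, `trips_v2`) are hypothetical, since the slides do not give BlaBlaCar's real names.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Build the body of a POST to /_aliases that repoints an alias in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names: the application always searches "trips".
    print(json.dumps(alias_swap_actions("trips", "trips_v1", "trips_v2"), indent=2))
```

Because the application only ever addresses the alias, the new index can be filled in the background under the new mapping and the switch needs no change on the application side.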
  • 32. I/O limits
      We have only 2 nodes
      ● The trip index is around 2 GB
      ● But the trip index has only 1 shard
      ● Can index 100 trips per second on a busy evening
      Solution: we installed Intel SSDs (while waiting for a distributed grouping feature)
  • 33. Choosing the analyzer
      Some fields must not be analyzed
      ● For example ISO country codes (IT for Italy or DE for Germany are ignored in some cases)
      A global analyzer has limits
      ● Accented characters from countries like France, Germany or Spain are not always parsed correctly
      ● One analyzer per country is difficult to implement in some cases
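The slide does not say why the country codes were ignored. One plausible explanation, offered here as an assumption, is that an analyzer lowercases "IT" to "it" and then drops it as a stopword. The Python toy below imitates that behaviour with a made-up stopword list and compares it with indexing the value verbatim, which is what `"index": "not_analyzed"` does for `userid` in the trip mapping. Nothing here calls ElasticSearch; `toy_analyzed`, `not_analyzed` and `TOY_STOPWORDS` are illustrative names.

```python
# Toy stopword list, for illustration only; not the list of any real analyzer.
TOY_STOPWORDS = {"it", "de", "the", "a", "of"}


def toy_analyzed(value):
    """Lowercase, split on whitespace and drop stopwords (toy analyzer)."""
    return [tok for tok in value.lower().split() if tok not in TOY_STOPWORDS]


def not_analyzed(value):
    """Keep the value verbatim as a single term."""
    return [value]


if __name__ == "__main__":
    for code in ("IT", "DE", "FR"):
        print(code, toy_analyzed(code), not_analyzed(code))
```

Under these assumptions the analyzed field produces no term at all for "IT" and "DE", so an exact-match filter on the country code can never match, while the verbatim field keeps every code searchable.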
  • 35. Using ElasticSearch to ease log analysis
  • 36. By the way… We're hiring! Devs, HTML ninjas, leaders… Come and see me right now, or send me your friends. (And we have beer, table football and an arcade cabinet.)
  • 37. Thank you! Follow us: @covoiturage. Apply now: join@BlaBlaCar.com