Backend Group - build to scale
Ran Levy, Backend Director
ranl@myheritage.com
Agenda
• Introduction to MyHeritage
• R&D structure to support scaling
• R&D methodology to support scaling up
• Scaling up technologies and solutions
– Micro-services architecture
– Relational DB scaling out
– Data storing for low latency
– SOLR scaling up
– Queuing services
– File servers
– Caching services
– Statistics services
Family history for Families
Building next generation tools for family history
enthusiasts and their families
Discover Preserve Share
Challenge: Scale
77 million registered users
1.7 billion tree profiles in 27 million trees
6 billion historical records
200 million photos
42 languages
1 million daily emails
R&D structure to support scaling up – guilds and bands
[Diagram: a matrix of Guilds (expertise and quality, organized by skill) against Bands (delivery, organized by missions). Roles shown: Band Master, Product Owner, Guild Manager, Guild member, …]
R&D Methodology to support scaling up
• Full continuous deployment
– All developers work on trunk
– Each commit triggers a flow that ends in a production update
R&D Methodology to support scaling up
• The procedure is backed by:
– Exposure flags (controlled by an external UI)
– Code reviews
– Unit/integration tests (over 80% coverage)
– Sensors for each released feature
– Automatic scanning of logs and stats
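An exposure flag of the kind listed above can be sketched roughly as follows. This is a minimal illustration, not MyHeritage's implementation: the `EXPOSURE_FLAGS` dictionary, the feature name, and the percentage-based rollout are all assumptions; the deck only says that flags are controlled by an external UI.

```python
import hashlib

# Hypothetical flag store: feature name -> percent of users exposed (0-100).
# In the deck this state is controlled by an external UI; a dict stands in here.
EXPOSURE_FLAGS = {"new_tree_view": 25}


def is_exposed(feature: str, user_id: int, flags=EXPOSURE_FLAGS) -> bool:
    """Return True if this user should see the feature."""
    percent = flags.get(feature, 0)  # unknown features stay hidden
    # Hash feature + user so each user lands in a stable bucket from 0 to 99.
    digest = hashlib.md5(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent
```

Because the bucket is derived from a hash rather than a random draw, a given user keeps the same answer between requests, and raising the percentage only adds users to the exposed group.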
Agenda
✓ Introduction to MyHeritage
✓ R&D structure to support scaling
✓ R&D methodology to support scaling up
• Scaling up technologies and solutions
– Micro-services architecture
– Relational DB scaling out
– Data storing for low latency
– SOLR scaling up
– Queuing services
– File servers
– Caching services
– Statistics services
Micro-services architecture
• Monolithic code can’t scale for long
– Changes are hard to localize
– Concurrent development is difficult
– The variety of coding languages is limited
– Specific services cannot be scaled up on their own
Micro-services architecture
• Solution:
– Micro-services architecture
• Migration from monolithic code is gradual
– Starting with isolated service
– Gradual replacement of core services
Relational DB scaling out techniques
• Data sharding
• Master – slaves
• Using MySQL 5.5 Percona
Relational DB scaling out techniques –
approaches for data sharding
• Consistent hashing based on key
• Used for MyHeritage Historical Records (6B records)
[Diagram: the key ABCD is passed through a hash function, Func(ABCD), which selects the DB instance used for Read(ABCD)]
Relational DB scaling out techniques –
approaches for data sharding
• Consistent hashing pros & cons
– Pros:
• Supports high performance lookup
• “Infinite scale”
– Cons:
• Re-sharding is not trivial and requires a code change.
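The key-to-shard function can be sketched as a hash ring. This is only an illustration of consistent hashing in general: the shard names, the use of MD5, and the virtual-node count are assumptions, and the deck does not say how `Func` is implemented at MyHeritage.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with virtual nodes (hypothetical shard names)."""

    def __init__(self, shards, vnodes=64):
        # Each shard owns several points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard) for shard in shards for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key: str) -> str:
        # Walk clockwise to the first point after the key, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[idx][1]
```

The lookup needs no database round trip, which matches the "high performance lookup" pro. With a ring, adding a shard moves only the keys that the new shard takes over, but the shard list still lives in code or configuration, and the moved rows still have to be migrated.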
Relational DB scaling out techniques –
approaches for data sharding
• Mapping table
• Use case: Users’ data in MyHeritage
[Diagram: Read(XYZ) first looks up key XYZ in the mapping table, then reads XYZ from the specific DB instance]
Relational DB scaling out techniques –
approaches for data sharding
• Mapping table pros & cons
– Pros:
• Easy re-sharding and scaling up.
– Cons:
• Requires DB lookup prior to data access.
• Limited scalability.
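The mapping-table approach can be sketched as a small directory that is consulted before every data access. The class and shard names are hypothetical, and an in-memory dict stands in for the lookup table that would live in a database.

```python
class ShardDirectory:
    """Mapping table from key to DB instance (a dict stands in for the DB table)."""

    def __init__(self):
        self._mapping = {}

    def assign(self, key: str, shard: str) -> None:
        """Record which DB instance holds the data for this key."""
        self._mapping[key] = shard

    def shard_for(self, key: str) -> str:
        """The extra lookup that must happen before the real read."""
        return self._mapping[key]

    def move(self, key: str, new_shard: str) -> str:
        """Re-shard one key: after the data is copied, only the mapping row changes."""
        old_shard = self._mapping[key]
        self._mapping[key] = new_shard
        return old_shard
```

Re-sharding becomes a data copy plus a row update, with no code change, at the cost of one lookup per access and a directory that can itself become the bottleneck.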
Relational DB scaling out techniques –
Master Slave
[Diagram labels: Master, Active standby, R/W flow, R/O flow]
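One way to picture the split between the R/W and R/O flows is a small router that sends writes to the master and spreads reads over the slaves. This is a sketch under assumptions: the host names are made up, the statement check is deliberately naive, and the deck does not describe how MyHeritage routes queries or fails over to the standby.

```python
import itertools


class ReplicatedPool:
    """Route read-only statements to slaves and everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over slaves; None means all traffic goes to the master.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql: str) -> str:
        # Naive check for illustration only; real routers need more care
        # (transactions, replication lag, SELECT ... FOR UPDATE, and so on).
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._slaves is not None:
            return next(self._slaves)
        return self.master
```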
Data Storing for low latency
• (Berkeley DB, MapDB)
• Cassandra
– Account Store
– People Store
– (Counters system, A/B testing data)
Data Storing for low latency – Account Store
• Motivations
– Access account data in under 1 ms
– High scale (~400M rows)
– Online schema changes
– Reduce OPEX
– Linear Scaling out architecture
Data Storing for low latency – Account Store
• Solution:
– Cassandra
– Apache Cassandra is an open source, distributed, decentralized,
elastically scalable, highly available, fault-tolerant, tunably
consistent, column-oriented database.
Data Storing for low latency – Account Store
• Cluster main characteristics:
– 5 nodes, 500GB SSD, Replication Factor - 3
– Community Edition 2.0.13
• Very low maintenance (no repair -pr)
• Using counters
• Using secondary indexes
• Using VNodes for easier maintenance
• Using SizeTieredCompactionStrategy compaction (write-optimized)
• Achieved performance
– Avg. local read latency: 0.108 ms
– Avg. local write latency: 0.022 ms
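With a replication factor of 3, Cassandra's tunable consistency comes down to simple arithmetic: a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The sketch below shows that rule; the deck does not say which consistency levels the Account Store uses, so the values are illustrative only.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum (a majority)."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor
```

For RF = 3, a quorum is 2 replicas, so quorum reads combined with quorum writes overlap on at least one replica, while reading and writing a single replica each trades that guarantee for lower latency.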
Data Storing for low latency – People Store
(in progress)
• Main Motivations
– Access data rapidly
• Avoiding the need to access multiple partitions
– High scale (scaling to 2B rows)
Search technologies
• Motivations
– Search billions of records in under 200 ms.
– Cope with differences: languages, spellings,
inaccuracies, missing data.
– Ranking of results.
Search technologies
• Solution:
– SOLR
– Solr is highly reliable, scalable and fault tolerant, providing
distributed indexing, replication and load-balanced querying,
automated failover and recovery, centralized configuration and
more.
Search technologies - SOLR
• Solr distributed search allows sharding a big index into smaller chunks
running on multiple hosts. We do not utilize Solr 4’s SolrCloud feature.
• Indexing: The client app is responsible for indexing each document on a specific shard
(chosen by hashing the document ID)
• Search: The client app sends the request to an aggregator Solr instance, which in turn
queries all shards and merges the results into one response (sorting, paging)
[Diagram: the application sends indexing requests directly to the index shards (Solr, Solr, Solr, ...) and sends search requests to an aggregator Solr, which searches the shards]
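The client-side part of this setup can be sketched as two small helpers: one picks the shard for a document at indexing time, the other builds a search request for the aggregator using Solr's `shards` request parameter. The host names, core name, and the choice of CRC32 as the hash are assumptions made for illustration; the deck only says "some hashing of document ID".

```python
import zlib
from urllib.parse import urlencode

# Hypothetical shard addresses in the host:port/path form used by `shards`.
SHARDS = [
    "solr1:8983/solr/records",
    "solr2:8983/solr/records",
    "solr3:8983/solr/records",
]


def shard_for_document(doc_id: str, shards=SHARDS) -> str:
    """Pick the shard that should index this document (stable across runs)."""
    return shards[zlib.crc32(doc_id.encode("utf-8")) % len(shards)]


def distributed_query_url(aggregator: str, query: str, shards=SHARDS,
                          rows: int = 10, start: int = 0) -> str:
    """Build a search URL that asks the aggregator to query every shard."""
    params = {"q": query, "shards": ",".join(shards), "rows": rows, "start": start}
    return f"http://{aggregator}/select?{urlencode(params)}"
```

A stable hash matters here: Python's built-in `hash()` is randomized per process for strings, so it would send the same document to different shards on different runs.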
Search technologies - SOLR
• Indexing hurts search performance
• Solution: split indexing onto separate machines
• Single point of failure: the aggregator
[Diagram labels: Indexing, Indexer Solr, Replication, Searcher Solr, Search, Load Balancer (HAProxy), Aggregator Solr, NULL Solr, Static Resp.]
Queuing services
• (In-house queue implementation)
• (Beanstalkd)
• Kafka
– Kafka is a distributed, partitioned, replicated commit log service.
Queuing System – Kafka High Level Overview
[Diagram: producers (including Web and Daemons) and consumers (including Indexing, RecordMatching and Logstash reader) connect to Broker 1 and Broker 2. Topics shown: Family Tree changes, Activity, Notifications, with partitions labelled part 1, part 2, … part 32. Each broker holds a DRBD replica of the other broker. Other labels: Face recog., Notifications sys.]
Kafka @MyHeritage – Consumers (Indexing)
[Diagram: EventProcessors (one per consumer type, with a reader per partition) read from Broker 1 and Broker 2, get/update the KafkaWatermark, and add events to the IndexingQueue. IndexingWorkers fetch work from the queue and update items in SOLR.]
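The watermark flow on this slide can be sketched without any Kafka client: a reader asks for the stored offset of its partition, copies the next batch of events onto the indexing queue, and then advances the watermark. Everything here is a stand-in chosen for illustration (a list for the partition log, a dict-backed watermark store, a deque for the queue); the deck does not show the real interfaces.

```python
from collections import deque


class WatermarkStore:
    """Remembers, per consumer type and partition, the next offset to read."""

    def __init__(self):
        self._offsets = {}

    def get(self, consumer: str, partition: int) -> int:
        return self._offsets.get((consumer, partition), 0)

    def update(self, consumer: str, partition: int, offset: int) -> None:
        self._offsets[(consumer, partition)] = offset


def process_partition(consumer, partition, log, watermarks, queue, batch_size=100):
    """Read one batch from a partition log and hand it to the work queue."""
    start = watermarks.get(consumer, partition)
    batch = log[start:start + batch_size]
    for event in batch:
        queue.append(event)  # "Add event to queue"
    # Advance the watermark only after the whole batch has been queued.
    watermarks.update(consumer, partition, start + len(batch))
    return len(batch)
```

Updating the watermark after queuing gives at-least-once behaviour in this sketch: a crash between the two steps would replay the batch, so the indexing workers would need to tolerate duplicate events.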
File servers
• Traditional – File Servers
– ~30 file servers
– Total storage: 80 TB
– HTTP(S) accessible with REST APIs
File servers
• Ceph
– Use cases:
• SEO serving
• OpenStack
– Version in production: Firefly
– Using 40 TB
– Lessons learnt:
• Do not use large buckets without index sharding (supported from Hammer)
• If you can’t use Hammer, shard your buckets (or bad things WILL happen)
• Don’t use the high-density nodes
Caching services
• Memcached
• In research: Memcached proxy
• (CDN)
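A common way to use Memcached in front of a database is the cache-aside pattern, sketched below. The deck does not say which pattern or client library MyHeritage uses, so `FakeCache` is a hypothetical in-memory stand-in with the `get`/`set` shape that Memcached clients typically expose.

```python
class FakeCache:
    """In-memory stand-in for a Memcached client (illustration only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None on a cache miss

    def set(self, key, value):
        self._data[key] = value


def cached_fetch(cache, key, load_from_db):
    """Cache-aside read: try the cache, fall back to the DB, then fill the cache."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.set(key, value)
    return value
```

In this sketch a value of `None` is indistinguishable from a miss, so rows that legitimately load as `None` would be fetched from the database every time.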
Statistics Services
• In-house MySQL
• Graphite usage for Infrastructure
• In research for app metrics:
– Graphite over InfluxDB
– Cyanite (Graphite over Cassandra)
• Automated Anomaly Detection for infrastructure (Anodot)
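For the Graphite part, a metric can be reported with Graphite's plaintext protocol, which is one line of the form `path value timestamp` per data point, conventionally sent to port 2003. The metric name and host below are made up for illustration; the deck does not list the metrics that are collected.

```python
import socket
import time


def graphite_line(path: str, value, timestamp=None) -> str:
    """Format one data point in Graphite's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_metric(host: str, path: str, value, port: int = 2003) -> None:
    """Send a single data point to a Carbon listener over TCP."""
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(graphite_line(path, value).encode("ascii"))
```

For example, `send_metric("graphite.example.internal", "backend.solr.search_latency_ms", 42)` would report one latency sample, assuming a Carbon listener is reachable at that hypothetical host.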
Logging Services
• Central logging (including app logging + infrastructure):
in-house in MySQL + ELK stack
Ran Levy, Backend Director
ranl@myheritage.com
We are hiring!

Mais conteúdo relacionado

Mais procurados

Nagios Conference 2014 - Jeremy Rust - Avoiding Downtime Using Linux High Ava...
Nagios Conference 2014 - Jeremy Rust - Avoiding Downtime Using Linux High Ava...Nagios Conference 2014 - Jeremy Rust - Avoiding Downtime Using Linux High Ava...
Nagios Conference 2014 - Jeremy Rust - Avoiding Downtime Using Linux High Ava...Nagios
 
Patterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesPatterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesJosef Adersberger
 
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Nagios
 
How we scaled Rudder to 10k, and the road to 50k
How we scaled Rudder to 10k, and the road to 50kHow we scaled Rudder to 10k, and the road to 50k
How we scaled Rudder to 10k, and the road to 50kRUDDER
 
What's New in OpenLDAP
What's New in OpenLDAPWhat's New in OpenLDAP
What's New in OpenLDAPLDAPCon
 
OSMC 2012 | Zabbix 2.0: Even Better by Rihards Olups
OSMC 2012 | Zabbix 2.0: Even Better by Rihards OlupsOSMC 2012 | Zabbix 2.0: Even Better by Rihards Olups
OSMC 2012 | Zabbix 2.0: Even Better by Rihards OlupsNETWAYS
 
JavaEdge 2008: Your next version control system
JavaEdge 2008: Your next version control systemJavaEdge 2008: Your next version control system
JavaEdge 2008: Your next version control systemGilad Garon
 
Nagios Conference 2012 - Mike Weber - Failover
Nagios Conference 2012 - Mike Weber - FailoverNagios Conference 2012 - Mike Weber - Failover
Nagios Conference 2012 - Mike Weber - FailoverNagios
 
Continuous Deployment into the Unknown with Artifactory, Bintray, Docker and ...
Continuous Deployment into the Unknown with Artifactory, Bintray, Docker and ...Continuous Deployment into the Unknown with Artifactory, Bintray, Docker and ...
Continuous Deployment into the Unknown with Artifactory, Bintray, Docker and ...Gilad Garon
 
SFO15-110: Toolchain Collaboration
SFO15-110: Toolchain CollaborationSFO15-110: Toolchain Collaboration
SFO15-110: Toolchain CollaborationLinaro
 
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA SolutionsNagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA SolutionsNagios
 
Docker in a big company
Docker in a big companyDocker in a big company
Docker in a big companyDocker, Inc.
 
Nagios XI Best Practices
Nagios XI Best PracticesNagios XI Best Practices
Nagios XI Best PracticesNagios
 
Plany Konserwacji SQL Server dla żółtodziobów
Plany Konserwacji SQL Server dla żółtodziobówPlany Konserwacji SQL Server dla żółtodziobów
Plany Konserwacji SQL Server dla żółtodziobówTobias Koprowski
 
Webinar slides: Managing MySQL Replication for High Availability
Webinar slides: Managing MySQL Replication for High AvailabilityWebinar slides: Managing MySQL Replication for High Availability
Webinar slides: Managing MySQL Replication for High AvailabilitySeveralnines
 
KoprowskiT_HUG-MSSQL_AdHocMaintenancePlansForBeginners
KoprowskiT_HUG-MSSQL_AdHocMaintenancePlansForBeginnersKoprowskiT_HUG-MSSQL_AdHocMaintenancePlansForBeginners
KoprowskiT_HUG-MSSQL_AdHocMaintenancePlansForBeginnersTobias Koprowski
 

Mais procurados (20)

Nagios Conference 2014 - Jeremy Rust - Avoiding Downtime Using Linux High Ava...
Nagios Conference 2014 - Jeremy Rust - Avoiding Downtime Using Linux High Ava...Nagios Conference 2014 - Jeremy Rust - Avoiding Downtime Using Linux High Ava...
Nagios Conference 2014 - Jeremy Rust - Avoiding Downtime Using Linux High Ava...
 
Patterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesPatterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to Kubernetes
 
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
 
How we scaled Rudder to 10k, and the road to 50k
How we scaled Rudder to 10k, and the road to 50kHow we scaled Rudder to 10k, and the road to 50k
How we scaled Rudder to 10k, and the road to 50k
 
What's New in OpenLDAP
What's New in OpenLDAPWhat's New in OpenLDAP
What's New in OpenLDAP
 
OSMC 2012 | Zabbix 2.0: Even Better by Rihards Olups
OSMC 2012 | Zabbix 2.0: Even Better by Rihards OlupsOSMC 2012 | Zabbix 2.0: Even Better by Rihards Olups
OSMC 2012 | Zabbix 2.0: Even Better by Rihards Olups
 
JavaEdge 2008: Your next version control system
JavaEdge 2008: Your next version control systemJavaEdge 2008: Your next version control system
JavaEdge 2008: Your next version control system
 
Nagios Conference 2012 - Mike Weber - Failover
Nagios Conference 2012 - Mike Weber - FailoverNagios Conference 2012 - Mike Weber - Failover
Nagios Conference 2012 - Mike Weber - Failover
 
Continuous Deployment into the Unknown with Artifactory, Bintray, Docker and ...
Continuous Deployment into the Unknown with Artifactory, Bintray, Docker and ...Continuous Deployment into the Unknown with Artifactory, Bintray, Docker and ...
Continuous Deployment into the Unknown with Artifactory, Bintray, Docker and ...
 
OpenFlow @ Google
OpenFlow @ GoogleOpenFlow @ Google
OpenFlow @ Google
 
SFO15-110: Toolchain Collaboration
SFO15-110: Toolchain CollaborationSFO15-110: Toolchain Collaboration
SFO15-110: Toolchain Collaboration
 
Best Practices for Enterprise Continuous Delivery of Oracle Fusion Middlewa...
Best Practices for Enterprise Continuous Delivery of Oracle Fusion Middlewa...Best Practices for Enterprise Continuous Delivery of Oracle Fusion Middlewa...
Best Practices for Enterprise Continuous Delivery of Oracle Fusion Middlewa...
 
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA SolutionsNagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions
Nagios Conference 2014 - Andy Brist - Nagios XI Failover and HA Solutions
 
Docker in a big company
Docker in a big companyDocker in a big company
Docker in a big company
 
CV_Sudhindra Srinivasamurthy
CV_Sudhindra SrinivasamurthyCV_Sudhindra Srinivasamurthy
CV_Sudhindra Srinivasamurthy
 
Nagios XI Best Practices
Nagios XI Best PracticesNagios XI Best Practices
Nagios XI Best Practices
 
Plany Konserwacji SQL Server dla żółtodziobów
Plany Konserwacji SQL Server dla żółtodziobówPlany Konserwacji SQL Server dla żółtodziobów
Plany Konserwacji SQL Server dla żółtodziobów
 
Webinar slides: Managing MySQL Replication for High Availability
Webinar slides: Managing MySQL Replication for High AvailabilityWebinar slides: Managing MySQL Replication for High Availability
Webinar slides: Managing MySQL Replication for High Availability
 
KoprowskiT_HUG-MSSQL_AdHocMaintenancePlansForBeginners
KoprowskiT_HUG-MSSQL_AdHocMaintenancePlansForBeginnersKoprowskiT_HUG-MSSQL_AdHocMaintenancePlansForBeginners
KoprowskiT_HUG-MSSQL_AdHocMaintenancePlansForBeginners
 
Vivek Resume
Vivek ResumeVivek Resume
Vivek Resume
 

Destaque

Complex realtime event analytics using BigQuery @Crunch Warmup
Complex realtime event analytics using BigQuery @Crunch WarmupComplex realtime event analytics using BigQuery @Crunch Warmup
Complex realtime event analytics using BigQuery @Crunch WarmupMárton Kodok
 
IBM Bluemix Nice meetup #5 - 20170504 - Container Service based on Kubernetes
IBM Bluemix Nice meetup #5 - 20170504 - Container Service based on KubernetesIBM Bluemix Nice meetup #5 - 20170504 - Container Service based on Kubernetes
IBM Bluemix Nice meetup #5 - 20170504 - Container Service based on KubernetesIBM France Lab
 
Docker security introduction-task-2016
Docker security introduction-task-2016Docker security introduction-task-2016
Docker security introduction-task-2016Ricardo Gerardi
 
Evolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEOEvolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEODimitri Brunel
 
Docker swarm-mike-goelzer-mv-meetup-45min-workshop 02242016 (1)
Docker swarm-mike-goelzer-mv-meetup-45min-workshop 02242016 (1)Docker swarm-mike-goelzer-mv-meetup-45min-workshop 02242016 (1)
Docker swarm-mike-goelzer-mv-meetup-45min-workshop 02242016 (1)Michelle Antebi
 
Monitoring and tuning your chef server - chef conf talk
Monitoring and tuning your chef server - chef conf talk Monitoring and tuning your chef server - chef conf talk
Monitoring and tuning your chef server - chef conf talk Andrew DuFour
 
Retelling nonfiction
Retelling nonfictionRetelling nonfiction
Retelling nonfictionEmily Kissner
 
JavaOne 2017 - Choosing a NoSQL API and Database to Avoid Tombstones and Drag...
JavaOne 2017 - Choosing a NoSQL API and Database to Avoid Tombstones and Drag...JavaOne 2017 - Choosing a NoSQL API and Database to Avoid Tombstones and Drag...
JavaOne 2017 - Choosing a NoSQL API and Database to Avoid Tombstones and Drag...Leonardo De Moura Rocha Lima
 
Microsoft Microservices
Microsoft MicroservicesMicrosoft Microservices
Microsoft MicroservicesChase Aucoin
 
Do we need a bigger dev data culture
Do we need a bigger dev data cultureDo we need a bigger dev data culture
Do we need a bigger dev data cultureSimon Dittlmann
 
Performance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsPerformance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsMartin Gutenbrunner
 
Amazon Elastic Block Store for Application Storage
Amazon Elastic Block Store for Application StorageAmazon Elastic Block Store for Application Storage
Amazon Elastic Block Store for Application StorageAmazon Web Services
 
Big Data Europe: Simplifying Development and Deployment of Big Data Applications
Big Data Europe: Simplifying Development and Deployment of Big Data ApplicationsBig Data Europe: Simplifying Development and Deployment of Big Data Applications
Big Data Europe: Simplifying Development and Deployment of Big Data ApplicationsBigData_Europe
 
From 10 Users to 10 Milion in 10 Days - Adam Lev, Tamar Labs - DevOpsDays Tel...
From 10 Users to 10 Milion in 10 Days - Adam Lev, Tamar Labs - DevOpsDays Tel...From 10 Users to 10 Milion in 10 Days - Adam Lev, Tamar Labs - DevOpsDays Tel...
From 10 Users to 10 Milion in 10 Days - Adam Lev, Tamar Labs - DevOpsDays Tel...DevOpsDays Tel Aviv
 
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENT
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENTA BRIEF OVERVIEW ON WILDLIFE MANAGEMENT
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENTPintu Kabiraj
 
What's new in oracle ORAchk & EXAchk 12.2.0.1.2
What's new in oracle ORAchk & EXAchk 12.2.0.1.2What's new in oracle ORAchk & EXAchk 12.2.0.1.2
What's new in oracle ORAchk & EXAchk 12.2.0.1.2Gareth Chapman
 
6 Million Ways To Log In Docker - NYC Docker Meetup 12/17/2014
6 Million Ways To Log In Docker - NYC Docker Meetup 12/17/20146 Million Ways To Log In Docker - NYC Docker Meetup 12/17/2014
6 Million Ways To Log In Docker - NYC Docker Meetup 12/17/2014Christian Beedgen
 
Considerations for Operating An OpenStack Cloud
Considerations for Operating An OpenStack CloudConsiderations for Operating An OpenStack Cloud
Considerations for Operating An OpenStack CloudMark Voelker
 

Destaque (20)

IBM Containers- Bluemix
IBM Containers- BluemixIBM Containers- Bluemix
IBM Containers- Bluemix
 
Complex realtime event analytics using BigQuery @Crunch Warmup
Complex realtime event analytics using BigQuery @Crunch WarmupComplex realtime event analytics using BigQuery @Crunch Warmup
Complex realtime event analytics using BigQuery @Crunch Warmup
 
IBM Bluemix Nice meetup #5 - 20170504 - Container Service based on Kubernetes
IBM Bluemix Nice meetup #5 - 20170504 - Container Service based on KubernetesIBM Bluemix Nice meetup #5 - 20170504 - Container Service based on Kubernetes
IBM Bluemix Nice meetup #5 - 20170504 - Container Service based on Kubernetes
 
Docker security introduction-task-2016
Docker security introduction-task-2016Docker security introduction-task-2016
Docker security introduction-task-2016
 
Evolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEOEvolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEO
 
Docker swarm-mike-goelzer-mv-meetup-45min-workshop 02242016 (1)
Docker swarm-mike-goelzer-mv-meetup-45min-workshop 02242016 (1)Docker swarm-mike-goelzer-mv-meetup-45min-workshop 02242016 (1)
Docker swarm-mike-goelzer-mv-meetup-45min-workshop 02242016 (1)
 
Monitoring and tuning your chef server - chef conf talk
Monitoring and tuning your chef server - chef conf talk Monitoring and tuning your chef server - chef conf talk
Monitoring and tuning your chef server - chef conf talk
 
Retelling nonfiction
Retelling nonfictionRetelling nonfiction
Retelling nonfiction
 
JavaOne 2017 - Choosing a NoSQL API and Database to Avoid Tombstones and Drag...
JavaOne 2017 - Choosing a NoSQL API and Database to Avoid Tombstones and Drag...JavaOne 2017 - Choosing a NoSQL API and Database to Avoid Tombstones and Drag...
JavaOne 2017 - Choosing a NoSQL API and Database to Avoid Tombstones and Drag...
 
Microsoft Microservices
Microsoft MicroservicesMicrosoft Microservices
Microsoft Microservices
 
Do we need a bigger dev data culture
Do we need a bigger dev data cultureDo we need a bigger dev data culture
Do we need a bigger dev data culture
 
Performance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsPerformance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environments
 
Spring Batch
Spring BatchSpring Batch
Spring Batch
 
Amazon Elastic Block Store for Application Storage
Amazon Elastic Block Store for Application StorageAmazon Elastic Block Store for Application Storage
Amazon Elastic Block Store for Application Storage
 
Big Data Europe: Simplifying Development and Deployment of Big Data Applications
Big Data Europe: Simplifying Development and Deployment of Big Data ApplicationsBig Data Europe: Simplifying Development and Deployment of Big Data Applications
Big Data Europe: Simplifying Development and Deployment of Big Data Applications
 
From 10 Users to 10 Milion in 10 Days - Adam Lev, Tamar Labs - DevOpsDays Tel...
From 10 Users to 10 Milion in 10 Days - Adam Lev, Tamar Labs - DevOpsDays Tel...From 10 Users to 10 Milion in 10 Days - Adam Lev, Tamar Labs - DevOpsDays Tel...
From 10 Users to 10 Milion in 10 Days - Adam Lev, Tamar Labs - DevOpsDays Tel...
 
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENT
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENTA BRIEF OVERVIEW ON WILDLIFE MANAGEMENT
A BRIEF OVERVIEW ON WILDLIFE MANAGEMENT
 
What's new in oracle ORAchk & EXAchk 12.2.0.1.2
What's new in oracle ORAchk & EXAchk 12.2.0.1.2What's new in oracle ORAchk & EXAchk 12.2.0.1.2
What's new in oracle ORAchk & EXAchk 12.2.0.1.2
 
6 Million Ways To Log In Docker - NYC Docker Meetup 12/17/2014
6 Million Ways To Log In Docker - NYC Docker Meetup 12/17/20146 Million Ways To Log In Docker - NYC Docker Meetup 12/17/2014
6 Million Ways To Log In Docker - NYC Docker Meetup 12/17/2014
 
Considerations for Operating An OpenStack Cloud
Considerations for Operating An OpenStack CloudConsiderations for Operating An OpenStack Cloud
Considerations for Operating An OpenStack Cloud
 

Semelhante a MyHeritage backend group - build to scale

MongoDB Administration 101
MongoDB Administration 101MongoDB Administration 101
MongoDB Administration 101MongoDB
 
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Lucidworks
 
MySQL Options in OpenStack
MySQL Options in OpenStackMySQL Options in OpenStack
MySQL Options in OpenStackTesora
 
OpenStack Days East -- MySQL Options in OpenStack
OpenStack Days East -- MySQL Options in OpenStackOpenStack Days East -- MySQL Options in OpenStack
OpenStack Days East -- MySQL Options in OpenStackMatt Lord
 
MySQL NDB Cluster 8.0
MySQL NDB Cluster 8.0MySQL NDB Cluster 8.0
MySQL NDB Cluster 8.0Ted Wennmark
 
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural LessonsCassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural LessonsDataStax
 
Conceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónConceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónMongoDB
 
CosmosDB for DBAs & Developers
CosmosDB for DBAs & DevelopersCosmosDB for DBAs & Developers
CosmosDB for DBAs & DevelopersNiko Neugebauer
 
Membase Meetup - Silicon Valley
Membase Meetup - Silicon ValleyMembase Meetup - Silicon Valley
Membase Meetup - Silicon ValleyMembase
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache KuduAndriy Zabavskyy
 
Handling Massive Writes
Handling Massive WritesHandling Massive Writes
Handling Massive WritesLiran Zelkha
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrRahul Jain
 
Webinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with FusionWebinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with FusionLucidworks
 
Evolution of Distributed Database Technologies in the Digital era
Evolution of Distributed Database Technologies in the Digital eraEvolution of Distributed Database Technologies in the Digital era
Evolution of Distributed Database Technologies in the Digital eraVishal Puri
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?Deepak Shankar
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?Deepak Shankar
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?Deepak Shankar
 
Introduction of MariaDB AX / TX
Introduction of MariaDB AX / TXIntroduction of MariaDB AX / TX
Introduction of MariaDB AX / TXGOTO Satoru
 

Semelhante a MyHeritage backend group - build to scale (20)

MongoDB Administration 101
MongoDB Administration 101MongoDB Administration 101
MongoDB Administration 101
 
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
 
MySQL Options in OpenStack
MySQL Options in OpenStackMySQL Options in OpenStack
MySQL Options in OpenStack
 
OpenStack Days East -- MySQL Options in OpenStack
OpenStack Days East -- MySQL Options in OpenStackOpenStack Days East -- MySQL Options in OpenStack
OpenStack Days East -- MySQL Options in OpenStack
 
MySQL NDB Cluster 8.0
MySQL NDB Cluster 8.0MySQL NDB Cluster 8.0
MySQL NDB Cluster 8.0
 
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural LessonsCassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
 
Conceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónConceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producción
 
MySQL overview
MySQL overviewMySQL overview
MySQL overview
 
CosmosDB for DBAs & Developers
CosmosDB for DBAs & DevelopersCosmosDB for DBAs & Developers
CosmosDB for DBAs & Developers
 
Membase Meetup - Silicon Valley
Membase Meetup - Silicon ValleyMembase Meetup - Silicon Valley
Membase Meetup - Silicon Valley
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
Handling Massive Writes
Handling Massive WritesHandling Massive Writes
Handling Massive Writes
 
Building a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache SolrBuilding a Large Scale SEO/SEM Application with Apache Solr
Building a Large Scale SEO/SEM Application with Apache Solr
 
Webinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with FusionWebinar: Faster Log Indexing with Fusion
Webinar: Faster Log Indexing with Fusion
 
Apache drill
Apache drillApache drill
Apache drill
 
Evolution of Distributed Database Technologies in the Digital era
Evolution of Distributed Database Technologies in the Digital eraEvolution of Distributed Database Technologies in the Digital era
Evolution of Distributed Database Technologies in the Digital era
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?
 
Introduction of MariaDB AX / TX
Introduction of MariaDB AX / TXIntroduction of MariaDB AX / TX
Introduction of MariaDB AX / TX
 


MyHeritage backend group - build to scale

  • 1. Backend Group - build to scale Ran Levy, Backend Director ranl@myheritage.com
  • 2. Agenda • Introduction to MyHeritage • R&D structure to support scaling • R&D methodology to support scaling up • Scaling up technologies and solutions – Micro-services architecture – Relational DB scaling out – Data storing for low latency – SOLR scaling up – Queuing services – File servers – Caching services – Statistics services
  • 3. Family history for Families Building next generation tools for family history enthusiasts and their families Discover Preserve Share
  • 4. Challenge: Scale 77 million registered users 1.7 billion tree profiles in 27 million trees 6 billion historical records 200 million photos 42 languages 1 million daily emails
  • 5. R&D structure to support scaling up – guilds and bands [Diagram labels: Guilds (Expertise and Quality), Bands (Delivery), Band Master, Guild Manager, Product Owner, Guild member, Missions, Skill, …]
  • 6. R&D Methodology to support scaling up • Full continuous deployment –All developers are working on trunk –Commit triggers flow that ends in production update
  • 7. R&D Methodology to support scaling up • Procedure is backed up with: –Exposure flag (controlled by external UI) –Code reviews –Unit/integration tests (over 80% coverage) –Sensors for each released feature –Automatic logs and stats scanning
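An exposure flag lets code reach production on trunk while staying hidden until someone switches it on. The following is a minimal sketch of the idea, not MyHeritage's implementation: the class `ExposureFlags`, the flag name `new_search`, and the percentage-based rollout are all assumptions made for illustration. The slides only say that the flag is controlled by an external UI.

```python
import hashlib


class ExposureFlags:
    """Hypothetical in-memory flag store; a real one would be fed by the external UI."""

    def __init__(self, settings):
        # flag name -> percentage of users (0..100) that should see the feature
        self._settings = dict(settings)

    def set_exposure(self, flag, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._settings[flag] = percent

    def is_exposed(self, flag, user_id):
        # Unknown flags default to "not exposed" so new code stays dark.
        percent = self._settings.get(flag, 0)
        # Hash flag + user so each user lands in a stable bucket per flag.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < percent


flags = ExposureFlags({"new_search": 0})
if flags.is_exposed("new_search", user_id=42):
    pass  # serve the new code path
else:
    pass  # serve the existing code path
```

Because the bucket is derived from a hash rather than a random draw, a given user keeps seeing the same variant as the percentage is raised.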
  • 8. R&D Methodology to support scaling up
  • 9. R&D Methodology to support scaling up
  • 10. R&D Methodology to support scaling up
  • 11. Agenda ✓ Introduction to MyHeritage ✓ R&D structure to support scaling ✓ R&D methodology to support scaling up • Scaling up technologies and solutions – Micro-services architecture – Relational DB scaling out – Data storing for low latency – SOLR scaling up – Queuing services – File servers – Caching services – Statistics services
  • 12. Micro-services architecture • Monolithic code can’t scale for long – Localization of changes – Concurrency of development – Limits variety of coding languages – Scaling up specific services
  • 13. Micro-services architecture • Solution: – Micro-services architecture • Migration from monolithic code is gradual – Starting with isolated service – Gradual replacement of core services
  • 14. Relational DB scaling out techniques • Data sharding • Master – slaves • Using MySQL 5.5 Percona
  • 15. Relational DB scaling out techniques – approaches for data sharding • Consistent hashing based on key • Used for MyHeritage Historical Records (6B records) [Diagram labels: Read(ABCD), Func(ABCD)]
  • 16. Relational DB scaling out techniques – approaches for data sharding • Consistent hashing pros & cons – Pros: • Supports high performance lookup • “Infinite scale” – Cons: • Re-sharding is not trivial and requires code change.
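The key-based lookup described in the two slides above can be sketched as a hash ring. The slides do not show the actual hashing function (`Func`), so this is a generic illustration under stated assumptions: the class `HashRing`, the shard names, and the number of virtual points per shard are hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hash ring; not the function used at MyHeritage."""

    def __init__(self, shards, vnodes=64):
        # Each shard is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def shard_for(self, key):
        # The first ring point clockwise from the key's hash owns the key.
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["db0", "db1", "db2"])
shard = ring.shard_for("ABCD")  # computed locally, no database lookup needed
```

The lookup is a pure computation, which is where the high-performance pro comes from. The con still applies: adding a shard changes the ring, so the rows whose keys now map to the new shard have to be moved, and the shard list lives in code or configuration.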
  • 17. Relational DB scaling out techniques – approaches for data sharding • Mapping table • Use case: Users’ data in MyHeritage [Diagram labels: Read(XYZ), XYZ key lookup, Read(XYZ) from specific DB instance]
  • 18. Relational DB scaling out techniques – approaches for data sharding • Mapping table pros & cons – Pros: • Easy re-sharding and scaling up. – Cons: • Requires DB lookup prior to data access. • Limited scalability.
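The mapping-table approach trades the local computation for a directory lookup. A minimal sketch, assuming a hypothetical `shard_map` table and using an in-memory SQLite database and plain dicts as stand-ins for the directory DB and the data shards:

```python
import sqlite3


class ShardDirectory:
    """Hypothetical directory that maps each user key to a shard name."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE shard_map (user_key TEXT PRIMARY KEY, shard TEXT NOT NULL)"
        )

    def assign(self, user_key, shard):
        self._db.execute(
            "INSERT OR REPLACE INTO shard_map (user_key, shard) VALUES (?, ?)",
            (user_key, shard),
        )

    def lookup(self, user_key):
        row = self._db.execute(
            "SELECT shard FROM shard_map WHERE user_key = ?", (user_key,)
        ).fetchone()
        if row is None:
            raise KeyError(user_key)
        return row[0]


def read_user(directory, shards, user_key):
    # Step 1: directory lookup. Step 2: read from the specific DB instance.
    return shards[directory.lookup(user_key)][user_key]


def move_user(directory, shards, user_key, target):
    # Re-sharding: copy the row, repoint the mapping, drop the old copy.
    source = directory.lookup(user_key)
    shards[target][user_key] = shards[source][user_key]
    directory.assign(user_key, target)
    del shards[source][user_key]
```

Moving a user needs no code change, only a data copy and one mapping update, which matches the "easy re-sharding" pro. Every read pays for the extra lookup, and the directory itself becomes the component that limits scale.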
  • 19. Relational DB scaling out techniques – Master/Slave [Diagram labels: Master, Active standby, R/W flow, R/O flow]
  • 20. Data Storing for low latency • (Berkeley DB, MapDB) • Cassandra – Account Store – People Store – (Counters system, A/B testing data)
  • 21. Data Storing for low latency – Account Store • Motivations – Access account data in sub 1 msec – High scale (~400M rows) – Online schema changes – Reduce OPEX – Linear Scaling out architecture
  • 22. Data Storing for low latency – Account Store • Solution: – Cassandra – Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tuneably consistent, column-oriented database.
  • 23. Data Storing for low latency – Account Store • Cluster main characteristics: – 5 nodes, 500GB SSD, Replication Factor - 3 – Community Edition 2.0.13 • Very low maintenance (no repair -pr) • Using counters • Using secondary indexes • Using VNodes for easier maintenance • Using SizeTieredCompactionStrategy compactions (writes optimized) • Achieved performance – Avg. local read latency: 0.108 ms – Avg. local write latency: 0.022 ms
  • 24. Data Storing for low latency – People Store (in progress) • Main Motivations – Access data rapidly • Avoiding the need to access multiple partitions – High scale (scaling to 2B rows)
  • 25. Search technologies • Motivations – Search billions of records in under 200 msec. – Cope with differences: languages, spellings, inaccuracies, missing data. – Ranking of results.
  • 26. Search technologies • Solution: – SOLR – Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more.
  • 27. Search technologies - SOLR • Solr distributed search allows sharding a big index into smaller chunks running on multiple hosts. We do not use Solr 4’s SolrCloud feature. • Indexing: the client app is responsible for indexing each document on a specific shard (using a hash of the document ID). • Search: the client app sends the request to an aggregator Solr instance, which in turn queries all shards and merges the results into one response (sort, paging). [Diagram: the application indexes directly into the index shards (Solr instances) and searches through the aggregator Solr instance.]
  • 28. Search technologies - SOLR • Indexing hurts search performance • Split indexing to separate machines • Single point of failure: the aggregator [Diagram labels: Indexing, Indexer Solr, Replication, Searcher Solr, Search, Load Balancer (HAProxy), Aggregator Solr, NULL Solr (static response)]
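The two client-side responsibilities described above (picking a shard at indexing time, and merging per-shard results at search time) can be sketched without Solr itself. This is an illustration only: the slides say "some hashing of document ID" without naming the hash, so CRC32 is an assumption, and `shard_index` and `merge_shard_results` are hypothetical helper names. In the real setup the merge is done by the aggregator Solr instance, not by application code.

```python
import heapq
import zlib
from itertools import islice


def shard_index(doc_id, num_shards):
    # Stable hash of the document ID decides which shard indexes the document.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def merge_shard_results(per_shard_hits, start, rows):
    """Merge per-shard hit lists into one page of results.

    Each inner list holds (score, doc_id) tuples already sorted by score,
    highest first, and must contain at least the top start + rows hits
    of its shard for the merged page to be correct.
    """
    merged = heapq.merge(*per_shard_hits, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, start, start + rows))


hits = [
    [(0.9, "a"), (0.4, "d")],  # shard 0
    [(0.8, "b"), (0.1, "e")],  # shard 1
    [(0.7, "c")],              # shard 2
]
page = merge_shard_results(hits, start=1, rows=3)
```

The merge also shows why deep paging is costly in a sharded index: every shard has to return `start + rows` hits so the aggregator can discard the first `start` of the merged list.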
  • 29. Queuing services • (In-house queue implementation) • (Beanstalkd) • Kafka – Kafka is a distributed, partitioned, replicated commit log service.
  • 30. Queuing System – Kafka High Level Overview [Diagram: producers (Web, Daemons) write to topics (Family Tree changes, Activity, Notifications), each split into partitions 1 to 32 on Broker 1 and Broker 2; each broker has a DRBD replica of the other; consumers: Indexing, RecordMatching, Logstash reader, Face recog., Notifications sys.]
  • 31. Kafka @MyHeritage – Consumers (Indexing) • Per consumer type, reader per partition [Diagram: EventProcessors read from Broker 1 and Broker 2, get/update the watermark in KafkaWatermark, and add events to the IndexingQueue; IndexingWorkers fetch work from the queue and update items in SOLR.]
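The consumer flow on this slide (read a partition from a stored watermark, queue the events, advance the watermark) can be sketched in plain Python. This is a simulation under stated assumptions, not Kafka client code: a list stands in for a partition's commit log, a dict for the KafkaWatermark store, and a deque for the IndexingQueue. Class and method names are taken from the diagram labels where available and are otherwise hypothetical.

```python
from collections import deque


class WatermarkStore:
    """Stand-in for KafkaWatermark: next offset to read, per consumer and partition."""

    def __init__(self):
        self._offsets = {}

    def get(self, consumer, partition):
        return self._offsets.get((consumer, partition), 0)

    def update(self, consumer, partition, next_offset):
        self._offsets[(consumer, partition)] = next_offset


class EventProcessor:
    """One reader per partition for a given consumer type."""

    def __init__(self, consumer, partition, log, watermarks, queue):
        self.consumer = consumer
        self.partition = partition
        self.log = log                # list standing in for the partition's log
        self.watermarks = watermarks
        self.queue = queue            # stand-in for the IndexingQueue

    def poll(self, max_events=100):
        start = self.watermarks.get(self.consumer, self.partition)
        batch = self.log[start:start + max_events]
        for event in batch:
            self.queue.append(event)  # "Add event to queue"
        # Advance the watermark only after the whole batch has been queued.
        self.watermarks.update(self.consumer, self.partition, start + len(batch))
        return len(batch)


log = ["profile-1 changed", "profile-2 changed", "profile-3 changed"]
store, work = WatermarkStore(), deque()
EventProcessor("indexing", 0, log, store, work).poll(max_events=2)
```

Because the watermark is kept outside the reader, a restarted `EventProcessor` resumes where the previous one stopped, and each consumer type (indexing, record matching, and so on) tracks its own position in the same partition.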
  • 32. File servers • Traditional – File Servers –~30 file servers –Total storage: 80 TB –HTTP(s) accessible with REST APIs
  • 33. File servers • Ceph – Use cases: • SEO serving • OpenStack – Version in production: Firefly – Using 40TB – Lessons learnt: • Do not use large buckets without index sharding (supported from Hammer) • If you can’t use Hammer, shard your buckets (or bad things WILL happen) • Don’t use the high density nodes
  • 34. Caching services • Memcached • In research: Memcached proxy • (CDN)
  • 35. Statistics Services • In-house MySQL • Graphite usage for Infrastructure • In research for app metrics: – Graphite over InfluxDB – Cyanite (Graphite over Cassandra) • Automated Anomaly Detection for infrastructure (Anodot)
  • 36. Logging Services • Central logging (including app logging + infrastructure): in-house in MySQL + ELK stack
  • 37. Ran Levy, Backend Director ranl@myheritage.com We are hiring!