How Zhaopin built its Event Center using Apache Pulsar
Penghui Li
Sijie Guo
Zhaopin.com
Zhaopin.com is the biggest online recruitment service provider in China.
Zhaopin.com provides job seekers with a comprehensive resume service, the latest employment and career development information, and in-depth online job search for positions throughout China.
Zhaopin.com provides professional HR services to over 2.2 million clients, and its average daily page views exceed 68 million.
Who are we
Penghui Li
-Tech lead of infrastructure team at zhaopin.com
-5+ years of experience developing message queues and microservices
-Apache Pulsar Committer
Who are we
Sijie Guo
-Apache Pulsar Committer & PMC Member
-Apache BookKeeper Committer & PMC Member
-Interested in technologies around Event Streaming
-Worked for Twitter and Yahoo before
1. Why Build an Event Center
2. Why Apache Pulsar
3. Apache Pulsar at Zhaopin
4. Streaming Platform
5. Zhaopin’s contributions to Apache Pulsar
Why Build an Event Center
Data Silos -> Unified Platform
Data Silos
To Enterprises: MSMQ
To End Users: RabbitMQ
Data Processing: Kafka
Pain Points
• High Maintenance Cost
• Extremely hard to share data across teams
• Inconsistency between data silos
• Doesn’t Scale
• No consistent SLA
Unification - MQService
[Diagram: Submission Service, Resume Service and Job Search reach MQService over Thrift, HTTP and MQTT; MQService sits in front of several RabbitMQ instances]
Problems Solved:
• Simplified Operations
• Scale-out Service
• High availability
Problems Unsolved:
• Keep messages for longer period
• Data rewind
• Order Guarantee
Unification - MQService
Online Services: MQService
Data Processing: Kafka
Why Build an Event Center
[Figure: a queue whose messages 0-3 are shared among Consumer-1, Consumer-2, Consumer-3 and a new consumer (better consumption parallelism), compared with Partition-0, Partition-1 and Partition-2, where each partition's messages 0,1,2,3 are read in order by a single consumer (better order guarantee)]
Why Build an Event Center
RabbitMQ is a better fit for work-queue use cases: adding consumers increases consumption throughput, whereas Kafka needs more partitions to do the same.
We used RabbitMQ heavily for work-queue use cases.
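A minimal sketch of the trade-off above, as a plain in-memory model. It does not use RabbitMQ or Kafka; the function names, the round-robin dispatch, and the "partition i belongs to consumer i" rule are all assumptions made for illustration.

```python
from itertools import cycle


def dispatch_shared_queue(messages, consumers):
    """Work-queue model: all consumers pull from one queue (round-robin here)."""
    assignment = {c: [] for c in consumers}
    for msg, consumer in zip(messages, cycle(consumers)):
        assignment[consumer].append(msg)
    return assignment


def dispatch_partitioned(messages, num_partitions, consumers):
    """Partitioned-log model: each partition is read by exactly one consumer."""
    assignment = {c: [] for c in consumers}
    for key, seq in messages:
        partition = key % num_partitions
        # Assumption: partition i is owned by consumer i (wrapping around if
        # there are fewer consumers than partitions). Consumers beyond the
        # partition count never receive anything.
        owner = consumers[partition % len(consumers)]
        assignment[owner].append((key, seq))
    return assignment


if __name__ == "__main__":
    # Three keys, four messages per key, interleaved as they would arrive.
    messages = [(key, seq) for seq in range(4) for key in range(3)]
    consumers = ["consumer-1", "consumer-2", "consumer-3", "new-consumer"]

    print("shared queue:", dispatch_shared_queue(messages, consumers))
    print("partitioned: ", dispatch_partitioned(messages, 3, consumers))
```

In the shared-queue model the fourth consumer immediately takes a share of the work, but messages for one key are spread over several consumers. In the partitioned model each key stays with one consumer in order, and the fourth consumer sits idle until partitions are added.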
Why Build an Event Center
Kafka integrates well with the data processing ecosystem (Flink, Spark) and provides high throughput.
We used Kafka heavily for data processing.
Why Build an Event Center
But:
• The cost of operating two different messaging systems is high
• Data sits in two different silos
We need a unified platform to handle both scenarios
Why Apache Pulsar
Pulsar == Messaging + Storage
What is Apache Pulsar
“Flexible Pub/Sub messaging backed by durable log/stream storage”
Apache Pulsar - Multi Tenancy
Apache Pulsar - Queue + Streaming
Apache Pulsar - Cloud Native
Layered Architecture
• Independent Scalability
• Instant Failure Recovery
• Rebalance-free cluster expansion
Why Apache Pulsar
1. Pulsar provides a better abstraction of consumption patterns
2. Pulsar provides better fault tolerance and consistency options
3. Pulsar uses a scalable storage system (Apache BookKeeper)
4. Hierarchical topic management and resource isolation
A perfect match for our requirements.
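To make point 4 concrete: Pulsar topic names follow the hierarchy `{persistent|non-persistent}://tenant/namespace/topic`. The sketch below only parses that naming scheme with the standard library; the `TopicName` type, the `parse_topic_name` helper, and the tenant and namespace used in the usage line are hypothetical and are not taken from Zhaopin's deployment.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str      # top-level unit of isolation
    namespace: str   # grouping of topics inside a tenant
    topic: str       # local topic name


def parse_topic_name(full_name: str) -> TopicName:
    """Split a fully qualified Pulsar topic name into its hierarchy levels."""
    domain, sep, rest = full_name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unexpected topic domain in {full_name!r}")
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {full_name!r}")
    return TopicName(domain, *parts)


if __name__ == "__main__":
    # Hypothetical tenant, namespace and topic, for illustration only.
    print(parse_topic_name("persistent://recruiting/resume/resume-submitted"))
```

Because policies are attached at the tenant and namespace levels, a name like this already says which team owns a topic and which limits apply to it.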
Apache Pulsar at Zhaopin
20+ core services, 6 billion msgs/day
Unification - Apache Pulsar
[Diagram: Online Services (Queue) and Data Processing (Streaming) both run on Apache Pulsar]
Problems Solved:
• No Data Silos
• Queue + Streaming
• Disaster Recovery
• Infinite Message Storage (via Tiered Storage)
• Data rewinding
Milestones
POC
2018/07 2018/09
Pulsar in Production
2018/10
Pulsar-based Event Center, 1 billion msgs/day
2018/11
Won the best innovative platform award at Zhaopin
2018/12
3 billion msgs/day
2019/02
6 billion msgs/day
Core Metrics
50+ Namespaces
3000+ Topics
6+ billion Messages per day
3TB Storage per day
20+ Core Services
System Metrics
• Latency: 99.5% < 5 ms
• Write: 100K+/s
• Read: 200K+/s
• Network In: 190 MB+/s
• Network Out: 550 MB+/s
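A quick back-of-the-envelope check of what the daily totals imply on average. It assumes exactly 6 billion messages and 3 decimal terabytes per day, both of which the slides give only as lower bounds, and it assumes the 3TB figure counts each message once. If it includes storage replicas, the real per-message size is smaller.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(messages_per_day):
    """Average message rate if traffic were spread evenly over the day."""
    return messages_per_day / SECONDS_PER_DAY


def average_message_size_bytes(storage_bytes_per_day, messages_per_day):
    """Average stored bytes per message."""
    return storage_bytes_per_day / messages_per_day


if __name__ == "__main__":
    msgs_per_day = 6_000_000_000      # "6+ billion Messages per day"
    storage_per_day = 3 * 10**12      # "3TB Storage per day", decimal TB assumed

    print(round(average_rate_per_second(msgs_per_day)))               # 69444
    print(average_message_size_bytes(storage_per_day, msgs_per_day))  # 500.0
```

So the daily total works out to roughly 69K messages per second on average, at about 500 bytes per message under the assumptions above.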
Pulsar at Zhaopin
1. One copy of data, single source-of-truth.
2. No more worrying about data consistency between RabbitMQ and Kafka
3. Multi-tenancy makes topic management easier
4. Strong data durability means we no longer worry about message loss
Streaming Platform
Beyond an Event Center
Streaming Platform
Processing: Hive, Flink, Pulsar SQL
Streaming Layer: Pulsar
Tiered Storage: S3, HDFS, OSS
Stream to Stream
Stream -> Table
Table -> Stream
Stream -> Stream
Stream -> Stream
Table -> Table
Unified Data Processing
Hive
Topic Topic Topic Topic
Stream Processing
Contribute to Apache Pulsar
Zhaopin’s Contributions to Pulsar
• Client interceptors: we use this feature to track messages between producers and consumers
• Dead Letter Topic
• Time-partitioned message tracker
• Service URL provider: we use this feature to switch traffic dynamically
• Hive-Pulsar integration
• Multi-version Schema
• and more…
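The dead letter topic idea from the list above can be sketched as a small in-memory model: a message that keeps failing is redelivered a bounded number of times and then set aside instead of blocking the rest. This is an illustration of the pattern only, not Pulsar's implementation; the `consume_with_dead_letter` function and its `max_redeliver_count` parameter are assumed names.

```python
from collections import deque


def consume_with_dead_letter(messages, handler, max_redeliver_count=3):
    """Process messages; park any that still fail after the allowed redeliveries.

    Returns (processed, dead_letters).
    """
    pending = deque((msg, 0) for msg in messages)  # (message, redeliveries so far)
    processed, dead_letters = [], []
    while pending:
        msg, redeliveries = pending.popleft()
        try:
            handler(msg)
            processed.append(msg)
        except Exception:
            if redeliveries >= max_redeliver_count:
                # Redelivery budget exhausted: route to the dead letter list.
                dead_letters.append(msg)
            else:
                # Re-queue at the back so healthy messages are not blocked.
                pending.append((msg, redeliveries + 1))
    return processed, dead_letters


if __name__ == "__main__":
    def handler(msg):
        if msg == "bad":
            raise ValueError("cannot process")

    print(consume_with_dead_letter(["a", "bad", "b"], handler, max_redeliver_count=2))
```

With this model a poisoned message is attempted once plus `max_redeliver_count` redeliveries, then lands in the dead letter list where it can be inspected separately.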
Thank you
