How Kroger Embraced a “Schema First” Philosophy in Building Real-time Data Pipelines
Rob Hammonds, Rob Hoeting, Lauren McDonald
460,000 Associates Company-Wide
$121.2 Billion 2018 Total Sales
2,764 Supermarkets & Multi-Department Stores
Serving Customers in 35 States and The District of Columbia
We are evolving!
This is just a data swamp.
My report just broke!
How do I use the data?
Where do I find the data?
I want this new business feature by Friday.
Let's use AI and streaming analytics to solve all our problems!
I’m spending too much time spraying data to everyone!
I had to roll back my prod release because my data change broke someone!
Event Streaming Platform Tenets
Thou shalt democratize data
Thou shalt model business processes
Thou shalt have a high developer experience
Avro to the rescue!
Schema-First Development
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [
{
"name": "StoreName",
"type": "string"
},
{
"name": "StoreID",
"type": "int"
},
{
"name": "Location",
"type": "string"
}
]
}
StoreCreated.avsc
StoreCreated.newBuilder()
    .setStoreName("Kroger-Cincinnati")
    .setStoreID(513)  // StoreID is declared as "int" in the schema, so pass a number
    .setLocation("1014 Vine Street")
    .build()
Serialization Framework
Ugh… How do I make sense out of all this?
Let’s make the events composable!
Address: {
Street, City, State, ZipCode
}
Person: {
firstName,
lastName,
address: Address,
}
Role: {
reportsTo
title
}
EmployeeHiredEvent {
employee: Person
role: Role
startDate: date
salary: float
}
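One way to picture composable events is the minimal Python sketch below. It assumes shared building blocks live in a library keyed by their Avro full name and are expanded into the event schema the first time they are referenced (Avro lets a named type be defined once and referred to by name afterwards). The lower-case field names, the placement of Address, Person, and Role under `com.kroger.commons`, and the `inline_references` helper are illustrative assumptions, not the tooling described in the talk.

```python
import json

# Hypothetical shared building blocks, keyed by Avro full name.
LIBRARY = {
    "com.kroger.commons.Address": {
        "type": "record", "name": "Address", "namespace": "com.kroger.commons",
        "fields": [{"name": n, "type": "string"}
                   for n in ("street", "city", "state", "zipCode")],
    },
    "com.kroger.commons.Person": {
        "type": "record", "name": "Person", "namespace": "com.kroger.commons",
        "fields": [
            {"name": "firstName", "type": "string"},
            {"name": "lastName", "type": "string"},
            {"name": "address", "type": "com.kroger.commons.Address"},
        ],
    },
    "com.kroger.commons.Role": {
        "type": "record", "name": "Role", "namespace": "com.kroger.commons",
        "fields": [
            {"name": "reportsTo", "type": "string"},
            {"name": "title", "type": "string"},
        ],
    },
}

# The event only names the building blocks it is composed of.
EMPLOYEE_HIRED = {
    "type": "record", "name": "EmployeeHiredEvent", "namespace": "com.kroger",
    "fields": [
        {"name": "employee", "type": "com.kroger.commons.Person"},
        {"name": "role", "type": "com.kroger.commons.Role"},
        {"name": "startDate", "type": {"type": "int", "logicalType": "date"}},
        {"name": "salary", "type": "float"},
    ],
}


def inline_references(schema, library, defined=None):
    """Expand each library reference into its definition the first time it appears."""
    defined = set() if defined is None else defined
    if isinstance(schema, str):
        if schema in library and schema not in defined:
            defined.add(schema)
            return inline_references(library[schema], library, defined)
        return schema  # primitive type, or a named type already defined earlier
    if isinstance(schema, list):  # union: resolve every branch
        return [inline_references(s, library, defined) for s in schema]
    if isinstance(schema, dict) and schema.get("type") == "record":
        expanded = dict(schema)
        expanded["fields"] = [
            dict(f, type=inline_references(f["type"], library, defined))
            for f in schema["fields"]
        ]
        return expanded
    return schema  # other complex types are left untouched in this sketch


resolved = inline_references(EMPLOYEE_HIRED, LIBRARY)
print(json.dumps(resolved, indent=2))
```

The resolved output is a single self-contained schema, while the source files stay small and reusable.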
Class Generation and Publishing
Register the Schema
Compatibility Checking
Bootstrap all the things!
#LazyDevelopersAreTheBestDevelopers
My pipeline isn’t registering the schemas
I register schemas but my jar isn’t being published
I’m failing compatibility but I HAVE to change this field name
The bootstrap script isn’t working for me!
Schema Evolution
[Diagram: Producer v.1 and v.2, Avro Files v.1 and v.2, Consumer v.1 and v.2, Schema v.1 and Schema v.2, illustrating Backward Compatibility, Forward Compatibility, and Full Compatibility]
Schema Editor
Event Schema Lifecycle
V1
V2
V2
TIME
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [
{
"name": "StoreName",
"type": "string"
},
{
"name": "SID",
"type": "int"
}
]
}
StoreCreated.avsc – v1
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [
{
"name": "StoreName",
"type": "string"
},
{
"name": "SID",
"type": "int"
},
{
"name": "Location",
"type": "string",
"default": "Cincinnati"
}
]
}
StoreCreated.avsc – v2
PRODUCTION Schema Registry: Full Compatibility
STAGE Schema Registry: Full Compatibility
DEVELOPMENT Schema Registry: Full Compatibility
V3
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [
{
"name": "StoreName",
"type": "string"
},
{
"name": "StoreID",
"type": "int"
},
{
"name": "Location",
"type": "string",
"default": "Cincinnati"
}
]
}
StoreCreated.avsc – v3
Changing the name of an attribute breaks compatibility!
PRODUCTION Schema Registry: Full Compatibility
STAGE Schema Registry: Full Compatibility
DEVELOPMENT Schema Registry: Full Compatibility
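The v1, v2, and v3 schemas above can be run through a simplified compatibility check to see why the rename is rejected. This is a minimal Python sketch under stated assumptions: it only compares field names, types, and defaults, and ignores type promotion, aliases, unions, and nested records, all of which a real registry check takes into account. The helper names (`can_read`, `is_backward`, `is_forward`, `is_full`) are hypothetical.

```python
def store(*fields):
    """Build a Store record schema from (name, type[, default]) tuples."""
    built = []
    for name, avro_type, *default in fields:
        field = {"name": name, "type": avro_type}
        if default:
            field["default"] = default[0]
        built.append(field)
    return {"type": "record", "name": "Store", "namespace": "com.kroger", "fields": built}


V1 = store(("StoreName", "string"), ("SID", "int"))
V2 = store(("StoreName", "string"), ("SID", "int"), ("Location", "string", "Cincinnati"))
V3 = store(("StoreName", "string"), ("StoreID", "int"), ("Location", "string", "Cincinnati"))


def can_read(reader, writer):
    """True if `reader` can decode data written with `writer` (simplified rules)."""
    written = {f["name"]: f for f in writer["fields"]}
    for field in reader["fields"]:
        source = written.get(field["name"])
        if source is None:
            # A field the writer never wrote must come with a default.
            if "default" not in field:
                return False
        elif source["type"] != field["type"]:
            return False
    return True


def is_backward(new, old):
    """New schema can read data written with the old schema."""
    return can_read(reader=new, writer=old)


def is_forward(new, old):
    """Old schema can read data written with the new schema."""
    return can_read(reader=old, writer=new)


def is_full(new, old):
    return is_backward(new, old) and is_forward(new, old)


print("v1 -> v2 full:", is_full(V2, V1))
print("v2 -> v3 backward:", is_backward(V3, V2))
print("v2 -> v3 forward:", is_forward(V3, V2))
```

Adding `Location` with a default passes in both directions, while renaming `SID` to `StoreID` fails both: to either side the renamed field looks like a required field that the other schema never wrote.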
I accidentally named a field wrong. Seems rather harsh, don’t you think?
I’m blocked.
Hey Robs and Lauren, can you delete this schema and clear my topic?
Schema Editor
Event Schema Lifecycle
V1
V2
TIME
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [
{
"name": "StoreName",
"type": "string"
},
{
"name": "StoreID",
"type": "int"
},
{
"name": "Location",
"type": "string"
}
]
}
V2 V2 V3
StoreCreated.avsc – v2
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [
{
"name": "StoreName",
"type": "string"
},
{
"name": "StoreID",
"type": "int"
},
{
"name": "Location",
"type": "com.kroger.commons.Location"
}
]
}
StoreCreated.avsc – v3
You can’t change the type of an attribute!
PRODUCTION Schema Registry: Full Compatibility
STAGE Schema Registry: Full Compatibility
DEVELOPMENT Schema Registry: Full Compatibility
How do we protect clients in production, but improve the DevX of schema development?
Schema Editor
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [{
"name": "StoreName",
"type": "string"
}, {
"name": "StoreID",
"type": "int"
},{
"name": "Location",
"type": "com.kroger.commons.Location"
}]
}
StoreCreated.avsc – v2.0.0
TIME
Major Version Compatibility (MVC)
2.0.0
PRODUCTION Schema Registry: Full Transitive Compatibility
STAGE Schema Registry: No Compatibility
DEVELOPMENT Schema Registry: No Compatibility
2.0.0
2.0.0
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [{
"name": "StoreName",
"type": "string"
}, {
"name": "StoreID",
"type": "int"
},{
"name": "Location",
"type": "com.kroger.commons.Location"
},{
"name": "StoreManager",
"type": "com.kroger.commons.Person",
"doc": "Manager of the store"
}]
}
StoreCreated.avsc – v2.0.1
2.0.1
When adding an attribute, make it nullable and add a default!
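The rule above can be applied mechanically. The following Python sketch uses a hypothetical `make_optional` helper that rewrites a field as a nullable union with a null default. It lists `"null"` first because the Avro specification has long required a union field's default to match the first branch of the union, and it emits JSON `null` rather than the string `"null"`.

```python
import json


def make_optional(field):
    """Return a copy of an Avro field as a nullable union that defaults to null."""
    field_type = field["type"]
    branches = field_type if isinstance(field_type, list) else [field_type]
    # Put "null" first so that the null default matches the first union branch.
    branches = ["null"] + [b for b in branches if b != "null"]
    return dict(field, type=branches, default=None)


store_manager = make_optional({
    "name": "StoreManager",
    "type": "com.kroger.commons.Person",
    "doc": "Manager of the store",
})
print(json.dumps(store_manager, indent=2))
```

Consumers still on the previous schema simply ignore the new field, and consumers on the new schema fall back to null when reading older records.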
Schema Editor
TIME
2.0.0
2.0.1
2.1.0
2.0.0
2.0.0
2.1.0
Major Version Compatibility (MVC)
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [{
"name": "StoreName",
"type": "string"
}, {
"name": "StoreID",
"type": "int"
},{
"name": "Location",
"type": "com.kroger.commons.Location"
},{
"name": "StoreManager",
"type": ["null", "com.kroger.commons.Person"],
"default": null,
"doc": "Manager of the store"
}]
}
StoreCreated.avsc – v2.0.1
PRODUCTION Schema Registry: Full Transitive Compatibility
STAGE Schema Registry: No Compatibility
DEVELOPMENT Schema Registry: No Compatibility
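The per-environment split shown here (strict in production, relaxed in development and stage) maps onto the compatibility setting that Confluent Schema Registry exposes per subject through `PUT /config/{subject}`. The Python sketch below only builds the requests and does not send them. The registry URLs and the `store-created-value` subject name are hypothetical placeholders, and how Kroger actually applies these settings is not stated in the slides.

```python
import json
import urllib.request

# Hypothetical registry URLs; the compatibility levels mirror the slide.
ENVIRONMENTS = {
    "development": ("https://schema-registry.dev.example.com", "NONE"),
    "stage": ("https://schema-registry.stage.example.com", "NONE"),
    "production": ("https://schema-registry.prod.example.com", "FULL_TRANSITIVE"),
}


def compatibility_request(base_url, subject, level):
    """Build (but do not send) a PUT /config/{subject} request."""
    body = json.dumps({"compatibility": level}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/config/{subject}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )


config_requests = {
    env: compatibility_request(url, "store-created-value", level)
    for env, (url, level) in ENVIRONMENTS.items()
}
for env, req in config_requests.items():
    print(env, req.get_method(), req.full_url, req.data.decode())
```

Passing one of these requests to `urllib.request.urlopen` would apply the setting, assuming the registry is reachable and the caller is authorized.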
Schema Editor
TIME
2.0.0
2.0.1
2.1.0
2.0.0
2.0.0
2.1.0 2.1.1
Major Version Compatibility (MVC)
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [{
"name": "Store",
"type": "string"
}, {
"name": "StoreID",
"type": "int"
},{
"name": "Location",
"type": "com.kroger.commons.Location"
},{
"name": "StoreManager",
"type": ["null", "com.kroger.commons.Person"],
"default": null,
"doc": "Manager of the store"
}]
}
StoreCreated.avsc – v2.1.1
You can’t change the name of an attribute, but you can add an alias!
PRODUCTION Schema Registry: Full Transitive Compatibility
STAGE Schema Registry: No Compatibility
DEVELOPMENT Schema Registry: No Compatibility
Schema Editor
TIME
2.0.0
2.0.1
2.1.0
2.0.0
2.0.0
2.1.0
Major Version Compatibility (MVC)
{
"type": "record",
"name": "Store",
"doc": "This is an avro schema for a store",
"namespace": "com.kroger",
"fields": [{
"name": "StoreName",
"aliases": ["Store"],
"type": "string"
}, {
"name": "StoreID",
"type": "int"
},{
"name": "Location",
"type": "com.kroger.commons.Location"
},{
"name": "StoreManager",
"type": ["null", "com.kroger.commons.Person"],
"default": null,
"doc": "Manager of the store"
}]
}
StoreCreated.avsc – v2.1.1
2.1.1
PRODUCTION Schema Registry: Full Transitive Compatibility
STAGE Schema Registry: No Compatibility
DEVELOPMENT Schema Registry: No Compatibility
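To see why an alias rescues a rename, recall that during Avro schema resolution a reader field can be matched to a writer field through the reader's aliases. Avro spells the attribute `aliases` and expects a JSON array of names. The Python sketch below is a simplified, name-only illustration of that matching; the `resolve_fields` helper is hypothetical and skips type checks and defaults.

```python
def resolve_fields(reader, writer):
    """Map each reader field name to the writer field that supplies it, or None."""
    written = {f["name"] for f in writer["fields"]}
    mapping = {}
    for field in reader["fields"]:
        # Try the field's own name first, then each of its aliases in order.
        candidates = [field["name"], *field.get("aliases", [])]
        mapping[field["name"]] = next((c for c in candidates if c in written), None)
    return mapping


# Data was written while the first field was still called "Store".
writer = {"fields": [{"name": "Store", "type": "string"},
                     {"name": "StoreID", "type": "int"}]}

reader_without_alias = {"fields": [{"name": "StoreName", "type": "string"},
                                   {"name": "StoreID", "type": "int"}]}
reader_with_alias = {"fields": [{"name": "StoreName", "aliases": ["Store"], "type": "string"},
                                {"name": "StoreID", "type": "int"}]}

print(resolve_fields(reader_without_alias, writer))
print(resolve_fields(reader_with_alias, writer))
```

Without the alias, `StoreName` has no source in the written data. With it, the reader picks up the value stored under the old name.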
Let's automate all the things…
#lazydevelopersarethebestdevelopers
What are we improving again?
Cross-platform Schema Development
Schema Composition
Schema Registration
JAR building
Schema Lifecycle
ELMR – Event Lifecycle Management Repository
[Diagram components: CI/CD Build Server (Java Plugin); ELMR (Global Schema and Metadata Store); Development, Stage, and Production Schema Registries; Artifactory (JAR files)]
Event Streaming Becomes a Thing
The Business
Can you help me understand all the fields in those inventory events?
No problem, I just sent you a 500 line Avro schema file which has everything you need.
How can these people live like this?
How we improved data discovery...
#LazyDevelopersAreTheBestDevelopers
ELMR UI
[Diagram components: ELMR-UI; Artifact Discovery; CI/CD Build Server (Java Plugin); ELMR (Global Schema and Metadata Store); Development, Stage, and Production Schema Registries; Artifactory (JAR files)]
Event Discovery & Socialization
Event Lifecycle Management
CI/CD Automation
Avro Standard
Events Streaming
A Review of our Journey
What’s next?
Future Ideas
INTUITIVE SCHEMA EDITING
GLOBAL STRUCTURES AND FIELDS
EXTENDED ATTRIBUTES AND VALIDATION
Thank You!!!
@RHammonds1 @RobHoeting @Lew181818
How Kroger embraced a "schema first" philosophy in building real-time data pipelines (Rob Hoeting, Rob Hammonds & Lauren McDonald, Kroger) Kafka Summit SF 2019

  • 1. 1 How Kroger Embraced a “Schema First” Philosophy in Building Real-time Data Pipelines
  • 2. Rob Hammonds Rob Hoeting Lauren McDonald
  • 3. 460,000 Associates Company-Wide $121.2 Billion 2018 Total Sales 2,764 Supermarkets & Multi-Department Stores Serving Customers in 35 States and The District of Columbia
  • 5. This is just a data swamp. My report just broke! How do I use the data? Where do I find the data? I want this new business feature by Friday. Let's use AI and streaming analytics to solve all our problems! I’m spending too much time spraying data to everyone! I had to roll back my prod release because my data change broke someone!
  • 6. Event Streaming Platform Tenets Thou shalt democratize data Thou shalt model business processes Thou shalt have a high developer experience
  • 8. Schema-First Development { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [ { "name": "StoreName", "type": "string" }, { "name": "StoreID", "type": "int" }, { "name": "Location", "type": "string" } ] } StoreCreated.avsc StoreCreated.newBuilder() .setStoreName(“Kroger-Cincinnati”) .setStoreID(“513”) .setLocation(“1014 Vine Street”) .build() Serialization Framework
  • 9. Ugh…How do I makes sense out of all this?
  • 10. Let’s make the events composable! Address: { Street, City, State, ZipCode } Person: { firstName, lastName, address: Address, } Role: { reportsTo title } EmployeeHiredEvent { employee: Person role: Role startDate: date salary: float }
  • 11. Class Generation and Publishing Register the Schema Compatibility Checking
  • 12. Bootstrap all the things! #LazyDevelopersAreTheBestDevelopers
  • 13.
  • 14. My pipeline isn’t registering the schemas I register schemas but my jar isn’t being published I’m failing compatibility but I HAVE to change this field name The bootstrap script isn’t working for me!
  • 15. Producer v.1 Producer v.2 Avro Files v.1 Avro Files v.2 Consumer v.1 Consumer v.2 Schema v.1 Schema v.2 Full Compatibility Backward Compatibility Forward Compatibility Schema Evolution
  • 16. Schema Editor Event Schema Lifecycle V1 V2 V2 TIME { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [ { "name": "StoreName", "type": "string" }, { "name": "SID", "type": "int" } ] } StoreCreated.avsc – v1 { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [ { "name": "StoreName", "type": "string" }, { "name": "SID", "type": "int" }, { "name": "Location", "type": "string", "default": "Cincinnati" } ] } StoreCreated.avsc – v2 PRODUCTION Schema Registry Full Compatibility STAGE Schema Registry Full Compatibility DEVELOPMENT Schema Registry Full Compatibility
  • 17. Schema Editor Event Schema Lifecycle V1 V2 V2 TIME { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [ { "name": "StoreName", "type": "string" }, { "name": "SID", "type": "int" } ] } StoreCreated.avsc – v1 { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [ { "name": "StoreName", "type": "string" }, { "name": "SID", "type": "int" }, { "name": "Location", "type": "string", "default": "Cincinnati" } ] } StoreCreated.avsc – v2 V3 { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [ { "name": "StoreName", "type": "string" }, { "name": "StoreID", "type": "int" }, { "name": "Location", "type": "string", "default": "Cincinnati" } ] } StoreCreated.avsc – v3 Changing the name of an attribute breaks compatibility! PRODUCTION Schema Registry Full Compatibility STAGE Schema Registry Full Compatibility DEVELOPMENT Schema Registry Full Compatibility
  • 18. I accidentally named a field wrong. Seems rather harsh, don’t you think? I’m blocked. Hey Robs and Lauren, can you delete this schema and clear my topic?
  • 19. Schema Editor Event Schema Lifecycle V1 V2 TIME { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [ { "name": "StoreName", "type": "string" }, { "name": "StoreID", "type": "int" }, { "name": "Location", "type": "string" } ] } V2 V2 V3 StoreCreated.avsc – v2 { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [ { "name": "StoreName", "type": "string" }, { "name": "StoreID", "type": "int" }, { "name": "Location", "type": "com.kroger.commons.Location" } ] } StoreCreated.avsc – v3 You can’t change the type of an attribute! PRODUCTION Schema Registry Full Compatibility STAGE Schema Registry Full Compatibility DEVELOPMENT Schema Registry Full Compatibility
  • 20. How do we protect clients in production, but improve the DevX of schema development?
  • 21. Schema Editor { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [{ "name": "StoreName", "type": "string" }, { "name": "StoreID", "type": "int" },{ "name": "Location", "type": "com.kroger.commons.Location" } ] } StoreCreated.avsc – v2.0.0 TIME Major Version Compatibility (MVC) 2.0.0 PRODUCTION Schema Registry Full Transitive Compatibility STAGE Schema Registry No Compatibility DEVELOPMENT Schema Registry No Compatibility 2.0.0 2.0.0 { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [{ "name": "StoreName", "type": "string" }, { "name": "StoreID", "type": "int" },{ "name": "Location", "type": "com.kroger.commons.Location" },{ "name": "StoreManager", "type": "com.kroger.commons.Person", "doc": "Manager of the store" } ] } StoreCreated.avsc – v2.0.1 2.0.1 When adding an attribute, make it nullable and add a default!
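The per-environment compatibility levels on this slide could be expressed as configuration requests. The sketch below assumes the registries are Confluent Schema Registry instances and uses that product's `PUT /config/{subject}` endpoint; the registry URLs and the subject name are hypothetical placeholders, and the code only builds the requests rather than sending them.

```python
import json

# Compatibility level per environment, as listed on the slide.
COMPATIBILITY = {
    "development": "NONE",
    "stage": "NONE",
    "production": "FULL_TRANSITIVE",
}

# Hypothetical registry endpoints, for illustration only.
REGISTRY_URL = {
    "development": "https://schema-registry.dev.example.com",
    "stage": "https://schema-registry.stage.example.com",
    "production": "https://schema-registry.prod.example.com",
}

def config_request(env, subject):
    """Describe the HTTP request that sets a subject's compatibility level."""
    return {
        "method": "PUT",
        "url": f"{REGISTRY_URL[env]}/config/{subject}",
        "headers": {"Content-Type": "application/vnd.schemaregistry.v1+json"},
        "body": json.dumps({"compatibility": COMPATIBILITY[env]}),
    }

# "StoreCreated-value" is an assumed subject name, not one taken from the talk.
for env in COMPATIBILITY:
    request = config_request(env, "StoreCreated-value")
    print(request["method"], request["url"], request["body"])
```

With this split, developers can rename or retype fields freely in development and stage, while production rejects any version that is not compatible with every earlier version of the subject.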
  • 22. Schema Editor TIME 2.0.0 2.0.1 2.1.0 2.0.0 2.0.0 2.1.0 Major Version Compatibility (MVC) { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [{ "name": "StoreName", "type": "string" }, { "name": "StoreID", "type": "int" },{ "name": "Location", "type": "com.kroger.commons.Location" },{ "name": "StoreManager", "type": ["null", "com.kroger.commons.Person"], "default": null, "doc": "Manager of the store" } ] } StoreCreated.avsc – v2.0.1 PRODUCTION Schema Registry Full Transitive Compatibility STAGE Schema Registry No Compatibility DEVELOPMENT Schema Registry No Compatibility
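The advice "make it nullable and add a default" can be captured in a small helper. In Avro the default of a nullable field is the JSON value `null`, not the string `"null"`, and older Avro versions require the default to match the first branch of the union, so `"null"` is conventionally listed first. The helper name `optional_field` is an illustrative assumption, not part of any Kroger tooling.

```python
import json

def optional_field(name, type_name, doc=""):
    """Build an Avro field that is nullable and defaults to null."""
    field = {"name": name, "type": ["null", type_name], "default": None}
    if doc:
        field["doc"] = doc
    return field

manager = optional_field("StoreManager", "com.kroger.commons.Person",
                         doc="Manager of the store")
# json.dumps renders Python's None as JSON null.
print(json.dumps(manager, indent=2))
```

Records written before the field existed can then be read with the new schema, because the reader fills in `null` for the missing `StoreManager`.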
  • 23. Schema Editor TIME 2.0.0 2.0.1 2.1.0 2.0.0 2.0.0 2.1.0 2.1.1 Major Version Compatibility (MVC) { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [{ "name": "Store", "type": "string" }, { "name": "StoreID", "type": "int" },{ "name": "Location", "type": "com.kroger.commons.Location" },{ "name": "StoreManager", "type": ["null", "com.kroger.commons.Person"], "default": null, "doc": "Manager of the store" } ] } StoreCreated.avsc – v2.1.1 You can’t change the name of an attribute, but you can add an alias! PRODUCTION Schema Registry Full Transitive Compatibility STAGE Schema Registry No Compatibility DEVELOPMENT Schema Registry No Compatibility
  • 24. Schema Editor TIME 2.0.0 2.0.1 2.1.0 2.0.0 2.0.0 2.1.0 Major Version Compatibility (MVC) { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [{ "name": "StoreName", "aliases": ["Store"], "type": "string" }, { "name": "StoreID", "type": "int" },{ "name": "Location", "type": "com.kroger.commons.Location" },{ "name": "StoreManager", "type": ["null", "com.kroger.commons.Person"], "default": null, "doc": "Manager of the store" } ] } StoreCreated.avsc – v2.1.1 2.1.1 PRODUCTION Schema Registry Full Transitive Compatibility STAGE Schema Registry No Compatibility DEVELOPMENT Schema Registry No Compatibility
  • 25. Schema Editor TIME 2.0.0 2.0.1 2.1.0 2.0.0 2.0.0 2.1.0 Major Version Compatibility (MVC) { "type": "record", "name": "Store", "doc": "This is an avro schema for a store", "namespace": "com.kroger", "fields": [{ "name": "StoreName", "aliases": ["Store"], "type": "string" }, { "name": "StoreID", "type": "int" },{ "name": "Location", "type": "com.kroger.commons.Location" },{ "name": "StoreManager", "type": ["null", "com.kroger.commons.Person"], "default": null, "doc": "Manager of the store" } ] } StoreCreated.avsc – v2.1.1 2.1.1 PRODUCTION Schema Registry Full Transitive Compatibility STAGE Schema Registry No Compatibility DEVELOPMENT Schema Registry No Compatibility
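How an alias lets a reader accept data written under a different field name can be sketched at the level of field names. In the Avro specification the field attribute is `aliases`, a JSON array of strings. The function below works on already-decoded records represented as plain dicts and is only an illustration of the name-matching idea, not Avro's binary decoder; `resolve_names` is a hypothetical name.

```python
def resolve_names(reader_schema, writer_record):
    """Map a decoded writer record onto the reader's field names.

    A reader field matches a writer field under its own name or any of its
    aliases; if nothing matches, the reader's default is used when present.
    """
    resolved = {}
    for field in reader_schema["fields"]:
        candidates = [field["name"], *field.get("aliases", [])]
        for key in candidates:
            if key in writer_record:
                resolved[field["name"]] = writer_record[key]
                break
        else:
            if "default" not in field:
                raise KeyError(f"no value or default for {field['name']}")
            resolved[field["name"]] = field["default"]
    return resolved

reader = {
    "type": "record", "name": "Store", "namespace": "com.kroger",
    "fields": [
        {"name": "StoreName", "aliases": ["Store"], "type": "string"},
        {"name": "StoreID", "type": "int"},
    ],
}

# A record written by a producer that used the name "Store" for the field.
written = {"Store": "Kroger-Cincinnati", "StoreID": 513}
print(resolve_names(reader, written))
```

The reader sees the value under `StoreName` even though the producer wrote it as `Store`, which is why an alias can stand in for a rename that would otherwise break compatibility.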
  • 26. Let’s automate all the things… #lazydevelopersarethebestdevelopers
  • 27. What are we improving again? Cross-platform Schema Development; Schema Composition; Schema Registration; JAR building; Schema Lifecycle
  • 28. Development Schema Registry Stage Schema Registry Production Schema Registry ELMR Global Schema and Metadata Store ELMR Artifactory JAR files CI/CD Build Server Java Plugin ELMR – Event Lifecycle Management Repository
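The slides do not describe how ELMR or the build plugin talks to the three registries, so the following is only a rough sketch of what a CI/CD step might do for each environment: check compatibility, then register. It assumes Confluent Schema Registry's REST endpoints (`POST /compatibility/subjects/{subject}/versions/latest` and `POST /subjects/{subject}/versions`); the URLs, the subject name, and the function names are hypothetical, and nothing is actually sent over the network.

```python
import json

CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"

def schema_payload(schema):
    """The registry expects the Avro schema as a JSON-encoded string."""
    return json.dumps({"schema": json.dumps(schema)})

def compatibility_check_request(base_url, subject, schema):
    return {
        "method": "POST",
        "url": f"{base_url}/compatibility/subjects/{subject}/versions/latest",
        "headers": {"Content-Type": CONTENT_TYPE},
        "body": schema_payload(schema),
    }

def register_request(base_url, subject, schema):
    return {
        "method": "POST",
        "url": f"{base_url}/subjects/{subject}/versions",
        "headers": {"Content-Type": CONTENT_TYPE},
        "body": schema_payload(schema),
    }

def promotion_plan(schema, subject, registries):
    """List the requests a pipeline would issue, in environment order."""
    plan = []
    for env, base_url in registries.items():
        plan.append((env, compatibility_check_request(base_url, subject, schema)))
        plan.append((env, register_request(base_url, subject, schema)))
    return plan

# Hypothetical registry endpoints and subject name.
REGISTRIES = {
    "development": "https://schema-registry.dev.example.com",
    "stage": "https://schema-registry.stage.example.com",
    "production": "https://schema-registry.prod.example.com",
}
STORE = {"type": "record", "name": "Store", "namespace": "com.kroger",
         "fields": [{"name": "StoreName", "type": "string"},
                    {"name": "StoreID", "type": "int"}]}

for env, request in promotion_plan(STORE, "StoreCreated-value", REGISTRIES):
    print(env, request["method"], request["url"])
```

Generating the plan as data keeps the sketch testable without a live registry; a real pipeline would send each request and stop at the first failed compatibility check.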
  • 32. Can you help me understand all the fields in those inventory events? No problem, I just sent you a 500-line Avro schema file which has everything you need.
  • 33. How can these people live like this?
  • 36. Development Schema Registry Stage Schema Registry Production Schema Registry ELMR Global Schema and Metadata Store ELMR Artifactory JAR files CI/CD Build Server Java Plugin ELMR-UI
  • 39. Development Schema Registry Stage Schema Registry Production Schema Registry ELMR Global Schema and Metadata Store ELMR Artifactory JAR files CI/CD Build Server Java Plugin ELMR-UI
  • 40. A Review of our Journey: Event Discovery & Socialization; Event Lifecycle Management; CI/CD Automation; Avro Standard; Events Streaming
  • 42. Future Ideas: Intuitive Schema Editing; Global Structures and Fields; Extended Attributes and Validation