Amid the increasingly competitive brewing industry, the ability of retailers and brewers to provide optimal product assortments for their consumers has become a key goal for business stakeholders. Consumer trends, regional heterogeneities and massive product portfolios combine to scale the complexity of assortment selection. At AB InBev, we approach this selection problem through a two-step method rooted in statistical learning techniques. First, regression models and collaborative filtering are used to predict product demand in partnering retailers. The second step involves robust optimization techniques to recommend a set of products that enhance business-specified performance indicators, including retailer revenue and product market share.
With the ultimate goal of scaling our approach to over 100k brick-and-mortar retailers across the United States and online platforms, we have implemented our algorithms in custom-built Python libraries using Apache Spark. We package and deploy production versions of Python wheels to a hosted repository for installation to production infrastructure.
To orchestrate the execution of these processes at scale, we use a combination of the Databricks API, Azure App Configuration, Azure Functions, Azure Event Grid and some custom-built utilities to deploy the production wheels to on-demand and interactive Databricks clusters. From there, we monitor execution with Azure Application Insights and log evaluation metrics to Databricks Delta tables on ADLS. To create a full-fledged product and deliver value to customers, we built a custom web application using React and GraphQL which allows users to request assortment recommendations in a self-service, ad-hoc fashion.
Building A Product Assortment Recommendation Engine
1. Building a Product Assortment Recommendation Engine for Brick-and-Mortar Retailers
Justin Morse, Staff Data Scientist, AB InBev
Ethan DuBois, Senior Software Engineer, AB InBev
2. Agenda
§ Introductions and overview
§ The problem: product assortment selection
§ The algorithmic solution
§ Deploying the solution
§ Lessons learned
Speakers: Justin Morse, Staff Data Scientist; Ethan DuBois, Senior Software Engineer
5. Pivoting towards a tech-oriented approach
Timeline, 2018–2021:
• LOLA team launched (5 employees)
• Incorporate Databricks into workflows
• Begin R&D partnership with Bud Lab & MIT
• National launch of first microservice
• Begin development of SKU-level recommendation engine
• BeerTech Organization launched (73 employees)
• Launch recommendation engine pilot
6. Which products should a retailer carry?
An average retailer has >10^100 ways to select their product assortment.
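The >10^100 figure is consistent with simple subset counting: with N candidate SKUs there are 2^N possible assortments, so a catalog of a few hundred products already clears that bound. A quick sanity check (the SKU count below is illustrative, not from the talk):

```python
import math

def assortment_count(n_skus: int) -> int:
    # Every subset of the candidate SKUs is a possible assortment.
    return 2 ** n_skus

# With roughly 333 candidate SKUs, the count already exceeds 10^100.
count = assortment_count(333)
digits = int(math.log10(count)) + 1  # number of decimal digits in the count
```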
7. How can we develop a quantitative approach to assortment planning that accounts for customer preferences, business priorities, and computational complexity?
8. Assortment Recommendation Pipeline
• Data Model: transform datasets into a format required for our pipeline
• Product Demand Prediction: make quantitative estimates of product demand for each partnering retailer
• Assortment Optimization: select the best product assortment given business requirements and estimated product demand
• Causal Analysis: measure the effects of our modeling interventions
9. Predicting demand for products in partnering retailers
• Custom-built library for a family of discrete choice models using PyTorch
• Executed on Databricks clusters with Azure Functions
• Next steps: scale training with Petastorm and Horovod
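The slide names a PyTorch library of discrete choice models; as a minimal, dependency-free sketch of the underlying idea (not the team's actual library), a multinomial logit model turns per-product utilities into choice probabilities via a softmax. The SKU names and utilities below are made up:

```python
import math

def choice_probabilities(utilities: dict) -> dict:
    """Multinomial logit: P(i) = exp(u_i) / sum_j exp(u_j).

    Subtracting the max utility first keeps exp() numerically stable.
    """
    u_max = max(utilities.values())
    exp_u = {sku: math.exp(u - u_max) for sku, u in utilities.items()}
    total = sum(exp_u.values())
    return {sku: e / total for sku, e in exp_u.items()}

# Illustrative utilities for three SKUs at one retailer.
probs = choice_probabilities({"lager": 1.2, "ipa": 0.4, "stout": -0.3})
```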
10. Optimizing retailer performance
• Use traditional numerical techniques to optimize a revenue objective function
• Include filters related to allowable business outcomes:
  - Size restrictions
  - Inventory restrictions
  - License restrictions
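As a toy illustration of the selection step (not the production optimizer, which relies on numerical techniques at scale), an exhaustive search over small assortments under a shelf-size restriction could look like the following; the SKU names and revenue estimates are illustrative:

```python
from itertools import combinations

def best_assortment(products, revenue, max_size):
    """Exhaustively search assortments of at most max_size products,
    maximizing total estimated revenue. Feasible only for tiny catalogs;
    with >10^100 candidates a real system needs numerical optimization.
    """
    best, best_value = (), 0.0
    for k in range(1, max_size + 1):
        for combo in combinations(products, k):
            value = sum(revenue[p] for p in combo)
            if value > best_value:
                best, best_value = combo, value
    return set(best), best_value

# Illustrative demand-weighted revenue estimates per SKU.
revenue = {"lager": 120.0, "ipa": 95.0, "stout": 40.0, "pilsner": 60.0}
assortment, value = best_assortment(list(revenue), revenue, max_size=2)
```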
11. Demonstrating value through small-scale pilots
• Recommendation engine launched in partnering retailers in the Ontario region
• Currently working with software engineering team to scale solution for North American and global launch
13. Scaling and deploying the solution
After a number of successful pilots, we needed to build a more robust solution that at minimum included:
• Production-quality code standards
• Best-practice code distribution
• Repository-based, version-controlled, automated CI/CD
• Flexible and lightweight configuration approach
• Decoupled communication between components
• Infrastructure-as-code
• Ability to scale infra up and down as necessary to meet demand
• API for integration with other applications
14. Scaling and deploying the solution
• Code
  ▪ Production-quality code standards
  ▪ Best-practice code distribution
  ▪ Repository-based, version-controlled, automated CI/CD
• Configuration
  ▪ Flexible/lightweight
  ▪ Decoupled from code
  ▪ Infrastructure-as-code
• Orchestration
  ▪ Decoupled communication between components
  ▪ Ability to automatically and programmatically scale infra up or down to meet demand
  ▪ API for integration with other applications
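The production pipeline decouples components with Azure Event Grid; as an illustrative, in-process sketch of that publish/subscribe pattern (the topic and payload names are hypothetical, and a real event grid is asynchronous and distributed):

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Minimal in-process publish/subscribe bus.

    Publishers and subscribers share only topic names, so components stay
    decoupled, mirroring the role an event grid plays in the pipeline.
    """
    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(event)

# A hypothetical "demand model finished" event could trigger optimization.
bus = EventBus()
received = []
bus.subscribe("demand.scored", received.append)
bus.publish("demand.scored", {"retailer_id": "r-001", "status": "ok"})
```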
15. Scaling and deploying the solution: Technologies
Configuration, code, and orchestration are supported by Azure App Configuration, Azure Key Vault, Azure App Insights, Azure Event Grid, and Azure Function Apps.
16. Code: Refactoring ML Processes
Moving from chained notebooks to end-to-end pipelines in Python
• Chained notebooks
  ▪ Didn't provide the ease of maintenance and visibility that we wanted
  ▪ Easy to get lost; added complexity
  ▪ Difficult to standardize, scan, and control quality across workstreams
• Process-controlled Python pipelines
  ▪ Object-oriented approach
  ▪ Make use of shared tools and utilities
  ▪ Ability to package and distribute more easily
  ▪ CI/CD integration with GitHub workflows (code scanning, unit/integration tests, etc.)
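The object-oriented, process-controlled pipelines described above might be sketched as follows; the class and stage names are illustrative, not the team's actual library:

```python
from abc import ABC, abstractmethod

class PipelineStep(ABC):
    """One stage of an end-to-end pipeline; a shared interface lets steps
    be composed, unit-tested, and packaged independently."""

    @abstractmethod
    def run(self, data: dict) -> dict: ...

class DataModelStep(PipelineStep):
    def run(self, data: dict) -> dict:
        # Transform raw inputs into the format later stages expect.
        data["prepared"] = True
        return data

class DemandPredictionStep(PipelineStep):
    def run(self, data: dict) -> dict:
        # Placeholder demand estimate per SKU (a real step would call a model).
        data["demand"] = {sku: 1.0 for sku in data.get("skus", [])}
        return data

class Pipeline:
    def __init__(self, steps) -> None:
        self.steps = steps

    def run(self, data: dict) -> dict:
        for step in self.steps:
            data = step.run(data)
        return data

result = Pipeline([DataModelStep(), DemandPredictionStep()]).run({"skus": ["lager", "ipa"]})
```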
20. Code: Packaging and Deployment
Packaging and deploying code to an easily accessible repository for installation on production resources
• Custom Python wheels
  ▪ Object-oriented, following a best-practices approach
  ▪ Built and deployed in GitHub Workflows as part of CI/CD
• Distribution: JFrog Artifactory repository
  ▪ Organizational PyPI repo
  ▪ Available for installation on all clusters or machines
  ▪ Authentication set up with cluster init scripts stored in DBFS
• Roadmap: move to GitHub Packages once PyPI is supported :'(
33. Conclusion
• MVP released, in production
  ▪ Collecting initial user feedback in preparation for future releases
• Lessons learned
  ▪ Development process: db-connect vs. notebooks, pros and cons
  ▪ Configuration: moving configs out of code wherever possible
  ▪ pandas vs. PySpark: understanding the distinction and implications
• Future roadmap
  ▪ Increased parallelization/distribution for both model training and the optimization process
  ▪ Added intelligence throughout the service: job progress and ETAs, different demand-estimate universes
  ▪ Enhanced DevOps approach to cloud resource deployment and environment management
34. Team and acknowledgments (data science, software engineering, data engineering, and research partners): Emmanuel Doro, Justin Morse, Phillip Theron, Gui Neubern, Zi Wang, Senthil Murugappan, Ethan DuBois, Ravi Kolla, Sarosh Ahmad, Griffin Ansel, Ashish Baiju, Chris Stone, Nelson Kandeya, Emily Shapiro, Jessica Zou, Vivek Farias, Nikos Trichakis, Tianyi Peng, Patricio Foncea, Lucas Diffey