2. Types Of Organizations
• Product
- Stable business context
- Implementation constraints
- Data science solution must align with a product roadmap
- Test and control
- Staged roll-outs
- Opportunity to create IP
3. Types Of Organizations
• Service
- Understanding the problem requirement
- Solution to be signed off by stakeholders
- Constant engagement
- Fewer constraints on implementation
- Data is not standardized
- Data validation is a must and needs sign-off
- Limited IP creation
4. The People In A Data Science Project
• Lead Data Scientist
• Data Scientists
• Engagement Managers
• Account Manager
• Sales
• Platform Owners
• Engineering team
• Design Team
5. Process Before Data
• 60% of time is spent on this phase
• Stages: Problem/Requirement → Consulting on opportunities → Solutioning → Identifying success metric and signing off on expected outputs → Reporting inputs → Implementation decisions → Roll-out discussions → Randomization understanding
• Owners, early stages: Lead Data Scientist, Client Managers, Client
• Owners, later stages: Lead Data Scientist, Client Managers, Client, Project Manager, Product or Platform Manager, Design
6. Process Continued After Data
• 40% of time is spent on this phase
• Stages: Data understanding → Data validation → Models → Optimization → Governance and model approvals → Model go-live → Measurement → Roll out
• Data understanding, validation and models: Lead Data Scientist, Data Scientist
• Optimization: Lead Data Scientist, Data Scientist, Project Manager, Product or Platform Manager, Design
• Governance and model approvals: Model Governance Board
• Go-live, measurement and roll-out: Lead Data Scientist, Client Managers, Client & Client Sponsors, Project Manager, Product or Platform Manager
7. Problem/Requirement
• Ranges from very vague to very structured
Eg: Can you build me a recommendation engine?
Eg: What would you do to increase sales on our website?
Eg: Let's build a prediction model with responder as the dependent variable
8. Consulting Layer
• Ask questions – is the stated problem the real pain point, or is there something else?
• For eg, when someone asks for a recommendation engine, all they may really want is better engagement in the app.
• The industry is full of buzzwords – getting to the real solution means understanding the true problem.
9. Consulting Layer
• What type of implementation does the customer want – real time (like an app or a notification) or an insight?
• Is there a product roadmap to align with?
• What modelling toolkit (Python/Scala/R etc.) can run in the deployment environment?
10. Solution Blueprint
• Make a solution blueprint
• Walk through it with stakeholders – Product, Engineering, client teams etc.
• Make sure there are no gaps
• IP can possibly be filed at this stage
11. Identifying The Success Metric
• Define a control group
• Define a benchmark
• Make sure the benchmark does not change with time and is truly neutral
• Arrive at the formula for incremental revenue, incremental sales etc.
• Sign off on this with stakeholders
• Attribution is a key factor
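One common form the incremental-revenue formula takes is a per-customer comparison against the control group, scaled up to the test population. A minimal sketch, assuming revenue and group sizes are the agreed inputs (the function name and all numbers are illustrative):

```python
# Hedged sketch: incremental revenue against a control-group benchmark.
# Group sizes usually differ, so compare per-customer averages first,
# then scale the difference up to the test group. Numbers are illustrative.

def incremental_revenue(test_revenue, test_size, control_revenue, control_size):
    """Incremental revenue = (test avg - control avg) * test group size."""
    test_avg = test_revenue / test_size
    control_avg = control_revenue / control_size  # the neutral benchmark
    return (test_avg - control_avg) * test_size

# Example: 90k test customers vs a 10k hold-out control group
lift = incremental_revenue(
    test_revenue=1_350_000, test_size=90_000,
    control_revenue=140_000, control_size=10_000,
)
print(round(lift))  # 90000
```

Whatever form is chosen, the point of the slide stands: the exact formula, not just the idea, is what stakeholders sign off on.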
12. Attribution Problems
• For eg: Number of chats to an agent may go down when a
chatbot is launched. But that does not mean chat was handled
well. What if the chat bot had not routed failed questions to
human agents?
• Another possibility because of the chat bot on the web page
more customers may come out and try the chat bot and hence
may increase the # of chats
13. Reports
• What should the measurement report contain?
• What are the metrics?
• Things like conversion how do you track?
• Does hanging up on an IVR means resolved?
• Reporting frequency – what it should be?
• Should report tally with any other existing system
14. Implementation Decisions
• The platform and its support
• External data may need to be pumped into the platform
• Does the platform get real-time data and have the ability to run real-time models?
• What real-time support does the platform have?
• What level of complexity does the platform allow?
• How should the data science model be delivered?
• Who is going to support the models?
• The Design/UX team and their role
• How should the model be handed over? Who takes ownership of delivery and maintenance?
15. Roadmap For Roll-Outs
• What is the roll-out roadmap?
• What should the gates be?
For eg: e-commerce roll-outs could be staged as a % of website traffic. In some organizations it could be market-based roll-outs. Certain customer groups or segments could be part of the roll-out.
16. Randomization Decisions
• How is randomness ascertained?
- Browser session id vs. visitor id for e-commerce
- Customer-id based random groups
- Callers randomized on caller id
- What is the system that ensures randomness?
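One way such a system can ensure stable, customer-id based random groups is deterministic hashing, so the same customer always lands in the same group across sessions. A minimal sketch (the function name and the 50/50 split are illustrative assumptions, not from the slides):

```python
# Hedged sketch: deterministic customer-id based group assignment.
# Hashing the id (rather than, say, modulo on a sequential id) avoids
# accidental correlation with sign-up date or region.
import hashlib

def assign_group(customer_id: str, test_fraction: float = 0.5) -> str:
    """Map a customer id to 'test' or 'control', stably across sessions."""
    digest = hashlib.sha256(customer_id.encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # roughly uniform in [0, 1]
    return "test" if bucket < test_fraction else "control"

# The same id always lands in the same group
assert assign_group("cust-42") == assign_group("cust-42")
```

The same idea applies to browser/visitor ids or caller ids; what matters is that the key being hashed matches the unit of randomization that was signed off.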
17. Data
• Understand the data
• Understand data distributions, especially of the dependent variable
• Run sanity checks to ensure distributions are in line with the problem statement
• Make sure the data used at this stage in the modelling process is what will actually be available in real time or during model execution
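A simple sanity check of the kind the slide describes is verifying that the dependent variable's distribution is plausible before any modelling starts. A minimal sketch, with an illustrative responder flag standing in for real data:

```python
# Hedged sketch: sanity-checking the dependent variable before modelling.
# The `responders` list is a stand-in for the real labels.
import statistics

responders = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]  # illustrative 0/1 labels

rate = statistics.mean(responders)
print(f"responder rate: {rate:.1%}")

# Guard-rail: a rate near 0% or 100% usually signals a data problem
# (wrong filter, leaked labels) rather than a genuinely rare outcome.
assert 0.0 < rate < 1.0, "degenerate dependent variable"
```

The same kind of check can be repeated on key features, and against the real-time feed, to confirm that training data and execution-time data actually look alike.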
19. Modeling Process – Contd
Train, test and out-of-time validation → Algorithm tuning → Model iterations → Final model → Governance approval → Model handover
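The first step of this flow, a train/test/out-of-time split, can be sketched as follows: rows before a cutoff date are split randomly into train and test, and rows after the cutoff form the out-of-time (OOT) validation set. All column names, dates and the 70/30 split are illustrative assumptions:

```python
# Hedged sketch: train / test / out-of-time (OOT) split by date.
import random
from datetime import date

# Illustrative monthly rows with a date and a dependent variable
rows = [{"dt": date(2023, m, 1), "y": m % 2} for m in range(1, 13)]
cutoff = date(2023, 10, 1)

# In-time data is split randomly; OOT is held out strictly by date
in_time = [r for r in rows if r["dt"] < cutoff]
oot = [r for r in rows if r["dt"] >= cutoff]

random.seed(0)          # reproducible shuffle
random.shuffle(in_time)
split = int(0.7 * len(in_time))
train, test = in_time[:split], in_time[split:]

print(len(train), len(test), len(oot))  # 6 3 3
```

The OOT set is never shuffled into train/test: its whole purpose is to check that the final model holds up on a time period it has not seen.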
20. Model Go-Live And Optimization
• Iterative process
• Validate reports and measure model performance at a defined frequency
• Fine-tune the model – add more data
• Capture more features – instrument or use additional data
• Optimize until the model stabilizes and the revenue or target is met
21. Optimization – Design
• There could also be design optimizations during this stage – for eg the content in a widget, or the number of stages in a checkout
• The model's execution performance could also be optimized
• Offers could be optimized at this stage
• Workflows in bots could be optimized as well
22. Sign-Off On Model Performance By Stakeholders
• Stakeholders to sign off on and buy in to the lift generated
• Resolution of any attribution conflicts
• If seasonality or other effects show up in model performance, they get resolved at this stage. In some cases the test or pilot period is extended.
23. Full Roll-Out
• The model is rolled out to the maximum possible extent
• Revenues need to be realized on an ongoing basis
• Additional opportunities can be sought
24. Patents And IP
• A lot of IP gets generated during a modelling process
• IP – novelty, context to the business, increased defensibility
• Patent review committee
• IDF – Provisional – Queries – Grant (could take >3 years from filing)
• An expensive process
• Any idea is a great idea – always discuss with the patent lawyer
• A simple variable could be a competitive differentiator
• Algorithms themselves are not patented; methods and processes are
25. Case Study
The NPS of telco giant Telx has been dipping. DecX is a product and services org engaging with Telx to improve its NPS.
As the chief scientist of DecX, what would you do? Where do you see the NPS going down? What can DecX do to bring up the NPS of Telx?