Abed Ajraou – Director of Data & Insights & Lead Data Scientist @First Utility
Putting Data Science in Your Business: a First Utility Feedback
First Utility – Putting customers in control; saving them money
● Cheaper tariffs
● Great service
● More knowledge
Driving the Success of DS Solutions: Skills, Roles and Responsibilities
Source: https://whatsthebigdata.com/2016/05/01/data-scientists-spend-most-of-their-time-cleaning-data/
What have we missed here … ?
Right Technology
Data – THE NEW POWER
Data sources:
● Individual transaction-level data
● Industry data
● Internal data
Data & Insights Platform: delivering business value for our clients
● Data for products and operational processes
● Data for dashboarding and business decisions
● Data for predictive analytics
This allows us to:
● Deliver a better service to our customers
● Optimise the business and give a better price to our customers
● Give more knowledge to our customers
Industry data, individual transaction-level data, internal data
● Better agility
● Data lake and data warehousing on the same platform
● Enable data discovery
● Collect more data
● Analyse the data with high performance
● Next generation of data visualisation on top of Hadoop
Right Mind-set
Start with a business problem
Not considering the business outcome is actually the number one reason for project failure!
Start with a business problem
Starting with the data and not with the question … ?
Right Methodology
Explore the data
● Exploratory Analysis by Visualizing the data
Feature engineering
The creative part, with a lot of trial and error.
Andrew Fogg won the competition by categorising the colours of cars.
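A minimal sketch of what categorising colours can look like as a feature-engineering step. The colour groups, field names and function names below are hypothetical illustrations, not the features used in the competition mentioned above.

```python
# Hypothetical mapping from raw colour names to coarse groups (illustrative only).
COLOUR_GROUPS = {
    "black": "neutral", "white": "neutral", "silver": "neutral", "grey": "neutral",
    "red": "bright", "orange": "bright", "yellow": "bright",
    "blue": "cool", "green": "cool",
}


def colour_group(raw_colour):
    """Normalise a raw colour string and map it to a coarse category."""
    key = raw_colour.strip().lower()
    return COLOUR_GROUPS.get(key, "other")  # unseen colours fall back to "other"


def add_colour_feature(rows):
    """Return copies of the input rows with an extra 'colour_group' feature."""
    return [{**row, "colour_group": colour_group(row["colour"])} for row in rows]


# Toy records standing in for a real car dataset.
cars = [
    {"id": 1, "colour": " Orange"},
    {"id": 2, "colour": "SILVER"},
    {"id": 3, "colour": "beige"},
]
features = add_colour_feature(cars)
```

The point of such a feature is that a model sees a handful of meaningful categories instead of many inconsistent raw strings; which grouping actually helps is found by trial and error.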
Models to predict
● ML is often used in DS.
● Currently, the ML library generating the most buzz is xgboost, which most of the time gives better results than traditional Random Forests and Neural Networks.
● Reasons for its success? More accurate, more efficient, easy to use, customisable and distributed.
● Less time needs to be spent on feature engineering, but some creativity is still required.
Models to predict: gradient boosting
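To make the idea of gradient boosting concrete, here is a minimal sketch in plain Python: each round fits a small decision stump to the current residuals and adds a damped version of it to the ensemble. This is not how xgboost is implemented (it adds regularisation, second-order information and much more); the function names and toy data are assumptions made for illustration.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on a 1-D feature that minimises squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:  # the largest value would leave the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boost stumps on squared loss: each round fits the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the residuals are the negative gradient.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data, roughly y = x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1, 6.8, 8.1]
baseline = fit_gradient_boosting(xs, ys, n_rounds=0)  # mean-only model
model = fit_gradient_boosting(xs, ys, n_rounds=50)
```

Comparing `mean_squared_error(model, xs, ys)` with the same call on `baseline` shows how much the boosted stumps improve on a constant prediction. This is training error only, so it says nothing about overfitting.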
Evaluation and validation
● Overfitting and underfitting are the biggest fears of a Data Scientist.
● Cross-validation is one way to guard the model against overfitting.
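A minimal sketch of k-fold cross-validation in plain Python, assuming a simple least-squares line as the model. The helper names and toy data are hypothetical; in practice the rows would usually be shuffled before being split into folds.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return slope, mean_y - slope * mean_x


def cross_validate(xs, ys, k=4):
    """Return the held-out mean squared error of each fold."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        # Train only on rows outside the current fold.
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        # Score only on rows the model has not seen.
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
        scores.append(sum(errors) / len(errors))
    return scores


# Toy data, roughly y = 2x + 1 with a little noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 16.9]
scores = cross_validate(xs, ys, k=4)
average_score = sum(scores) / len(scores)
```

A held-out error much larger than the training error is the usual sign of overfitting; cross-validation makes that gap visible rather than preventing it by itself.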
Feedback loop
● An ML algorithm is a living system: like any living thing, it needs care!
● Learning from its mistakes is the only way for it to progress and to fit a real AI model.
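One way to picture a feedback loop is a monitor that compares predictions with the outcomes observed later and flags when the model needs attention. The sketch below is an assumption about how this could be done, not the process described in the talk; the class name, window size and threshold are hypothetical.

```python
from collections import deque


class FeedbackMonitor:
    """Track recent prediction errors and flag when the model should be retrained."""

    def __init__(self, window=5, threshold=1.0):
        self.errors = deque(maxlen=window)  # keeps only the most recent errors
        self.threshold = threshold

    def record(self, predicted, actual):
        """Store the absolute error once the real outcome is known."""
        self.errors.append(abs(predicted - actual))

    def needs_retraining(self):
        # Wait for a full window so one bad prediction does not trigger a retrain.
        if len(self.errors) < self.errors.maxlen:
            return False
        return sum(self.errors) / len(self.errors) > self.threshold


monitor = FeedbackMonitor(window=3, threshold=1.0)

# While predictions track reality, no retraining is requested.
for predicted, actual in [(10.0, 10.2), (11.0, 10.9), (12.0, 12.1)]:
    monitor.record(predicted, actual)
healthy = monitor.needs_retraining()

# Once predictions drift away from observed outcomes, the monitor raises the flag.
for predicted, actual in [(13.0, 15.5), (14.0, 17.0), (15.0, 18.2)]:
    monitor.record(predicted, actual)
drifted = monitor.needs_retraining()
```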
Bad Methodology
Main reasons:
• No clear business case
• Trying to build the most accurate model from the start
• No agility
• No code version control
Iterative delivery is key
Sprint 1
Sprint 2
Main takeaways:
• Agility is required
• Weekly delivery is highly recommended to avoid falling into the “tunnel effect”
Going forward: AML (Automated Machine Learning)
Gartner Says “More Than 40 Percent of Data Science Tasks Will Be Automated by 2020”
Source: https://www.gartner.com/newsroom/id/3570917
Automation in Machine Learning is starting
Gain in Efficiency
● In the early days of BI, we gained efficiency by using ETL tools rather than hand-written scripts.
However, ML is often associated with R/Python/Scala coding.
Dataiku Flow => enables AML
My favorite app
The Collaborative Data Science Platform: Dataiku
Data Science is nothing without a team
Data Science is a range of skills!
It’s quite rare to find them all in a single person
Source: Dsradar.com
Thank you for your attention
Any questions?
Keep in contact: @AAjraou

SDD2017 - 03 Abed Ajraou - putting data science in your business a first utility feedback
