
Data science presentation 2nd CI day

A presentation on Data Science delivered by Mohammed Barakat at the 2nd Jordanian Continuous Improvement Open Day in Amman on 3rd October 2015.

  1. CIJ is Sponsored By: Career of Future 10/13/2015
  2. About Me
     Mohammed K. Barakat
     • Industrial Engineer, The University of Jordan
     • Business Excellence Manager, FINE Hygienic Paper Company
     • Professional Engineer in Industrial Engineering (PE), (JCPQA-JEA)
     • Project Management Professional (PMP), (PMI)
     • Risk Management Professional (PMI-RMP), (PMI)
     • Certified Six Sigma Black Belt (CSSBB), (ASQ)
     • Certified Six Sigma Green Belt (CSSGB), (ASQ)
     • Microsoft Certified Technology Specialist (MCTS), (Microsoft)
     • Microsoft Certified Trainer (MCT), (Microsoft)
     mohammedbarakat, MohdBarakat, MohdKBarakat
  3. Data Science: Career of the Future
     http://www.wired.com/insights/2014/06/tell-kids-data-scientists-doctors/
     • …Did you hear that? Data scientists earning more than doctors…
     • …But salary is not the only reason…
     • …data scientists will have a measurable impact on the future of healthcare.
  4. Why Data Science?
     http://www.economist.com/node/15579717
     • …the quantity of information in the world is soaring
     • …150 exabytes (billion gigabytes) of data in 2005. This year, it will create 1,200 exabytes…
     • …keeping up with this flood, and storing the bits that might be useful, is difficult enough…
     • …Analyzing it, to spot patterns and extract useful information, is harder.
     • …Even so, the data deluge is already starting to transform business…
  5. Why is “Data Scientist” a hugely important profession in the next decade?
     “I keep saying that the sexy job in the next 10 years will be statisticians,” said Hal Varian, chief economist at Google. “And I’m not kidding.”
     https://www.youtube.com/watch?v=pi472Mi3VLw
  6. Why is “Data Scientist” a hugely important profession in the next decade?
     • …ability to take the data
     • …extract value from it
     • …understand the process
     • …visualize it
     • …Not only at the professional level
     • …communicate it
     • …Ubiquitous data…but
     • …Statisticians are just part of it
     • …Scarcity in ability to understand data and extract value from it
     • …Managers need to access and understand the data themselves
     • …No army behind the scenes to digest the information for you
  7. What is Data Science?
     “Data Science is the extraction of knowledge from large volumes of data that are structured or unstructured.”
     It often requires sorting through a great amount of information and writing algorithms to extract insights from this data.
  8. What is Big Data?
     "Big Data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization."
     The 3V’s of Big Data:
     • Volume: amount of data
     • Velocity: speed of data in and out
     • Variety: range of data types and sources
  9. The Data Science Process
  10. The Data Scientist Toolbox
      R Software: a software environment for statistical computing and graphics
  11. The Data Scientist Toolbox
      RStudio: open source software that makes it easy for anyone to analyze data with R
  12. The Data Scientist Toolbox
      You’ve got to do a lot of coding!
  13. The Data Scientist Toolbox
      You’ve got to work out a lot of statistics!
  14. The Data Scientist Toolbox
      • GitHub.com: share your results and code
      • RPubs.com: publish your full report and build a personal brand
  15. The Data Scientist Toolbox
      RPubs.com
      You’d be a Data Scientist…
      • …evidence-based results
      • …reproducible research
  16. The Data Science process explained
      STEP 1: Getting and Cleaning Data
      • Downloading files
      • Reading data
      • Raw vs. Tidy data
      • Merging data
      • Reshaping data
      • Summarizing data
      • Data ‘Housekeeping’
  17. The Data Science process explained
      STEP 2: Exploratory Data Analysis
      • Understand data properties
      • Find patterns in data
      • Communicate results
      • An exploratory graph is made quickly
      • Many are made
      • The goal is personal understanding
  18. The Data Science process explained
      STEP 3: Perform Statistical Inference
      “Statistical inference is the process of drawing formal conclusions from data.”
      Some techniques and concepts:
      • Sampling
      • Randomization
      • Hypothesis Testing
      • Confidence Intervals (uncertainty)
      • Experimental Design
  19. The Data Science process explained
      STEP 4: Perform Regression Modelling
      “a statistical process for estimating the relationships among variables”
      • Helps you understand how the value of the dependent variable changes when any one of the independent variables is varied
      • Widely used for prediction (next step)
  20. The Data Science process explained
      STEP 5: Perform Machine Learning
      Machine learning “is a computer's way of learning from examples by using algorithms that take in data and improve themselves to predict on new data”
      Example: the spam filter working in the background to block your junk email.
  21. The Data Science process explained
      STEP 6: Make your research Reproducible
      “Make analytic data and code available so that others may reproduce findings”
      Why? To provide scientific evidence of your findings.
      http://www.rpubs.com/mohammedkb/TransMPGAnalysis
  22. What it takes to be a good Data Scientist
      • Business skills
      • Communications skills
      • Analytical skills
      • Computer science
      • Statistics
      • Creativity
      • Scientific Mindset
      • Passion & Perseverance
  23. What to do next?
      • Start learning about Data Science
      • Take a Massive Open Online Course (MOOC):
        o Coursera/Data Science
        o DataCamp
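
Steps 1 to 4 of the process described in slides 16 to 19 can be sketched in a few lines of code. The talk itself points to R and RStudio; the sketch below uses Python's standard library instead, purely as an illustration. The CSV data, the column names (`speed`, `defects`) and all function names are invented for this example and do not come from the presentation. The confidence interval uses a normal approximation, which is a simplifying assumption for such a small sample.

```python
import csv
import io
import math
import statistics

# Hypothetical raw data, invented for illustration only: one row has a
# missing value and one row is an exact duplicate.
RAW_CSV = """speed,defects
10,2
20,4
30,
40,8
40,8
50,11
60,12
"""


def clean(raw_text):
    """Step 1: read raw CSV text, drop incomplete rows and exact duplicates."""
    rows, seen = [], set()
    for rec in csv.DictReader(io.StringIO(raw_text)):
        if not rec["speed"] or not rec["defects"]:
            continue  # skip rows with a missing value
        row = (float(rec["speed"]), float(rec["defects"]))
        if row not in seen:
            seen.add(row)
            rows.append(row)
    return rows


def summarize(values):
    """Step 2: a quick numeric summary for personal understanding."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def mean_confidence_interval(values, z=1.96):
    """Step 3: approximate 95% interval for the mean (normal approximation)."""
    centre = statistics.mean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return centre - half_width, centre + half_width


def fit_line(xs, ys):
    """Step 4: ordinary least squares fit of y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    rows = clean(RAW_CSV)
    speeds = [r[0] for r in rows]
    defects = [r[1] for r in rows]
    print("summary:", summarize(defects))
    print("95% CI for mean defects:", mean_confidence_interval(defects))
    intercept, slope = fit_line(speeds, defects)
    # Using the fitted line on an unseen input is the prediction use of
    # regression mentioned in Step 4.
    print("predicted defects at speed 70:", intercept + slope * 70)
```

Publishing the data and a script like this together, for example on GitHub or RPubs as the slides suggest, is what lets someone else rerun the analysis and check the findings (Step 6).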
