Data Science Tutorial | What is Data Science? | Data Science For Beginners | Edureka

** Data Science Certification using R: https://www.edureka.co/data-science **
This Data Science Tutorial presentation gives you an in-depth understanding of data science and shows how it is used in the real world to solve data-driven problems. The session covers the following topics:
Need for Data Science
Walmart Use case
What is Data Science?
Who is a Data Scientist?
Data Science – Skill set
Data Science Job roles
Data Life cycle
Introduction to Machine Learning
K-Means Use case
K-Means Algorithm
Hands-On
Data Science certification

Blog Series: http://bit.ly/data-science-blogs

Data Science Training Playlist: http://bit.ly/data-science-playlist

Follow us to never miss an update in the future.

Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka

Data Science Tutorial | What is Data Science? | Data Science For Beginners | Edureka

  1. DATA SCIENCE CERTIFICATION TRAINING (www.edureka.co/data-science). Agenda: 1. Need for Data Science; 2. Walmart Use Case; 3. What is Data Science?; 4. Who is a Data Scientist?; 5. Data Science – Skill Set; 6. Data Science Job Roles; 7. Data Life Cycle; 8. Introduction to Machine Learning; 9. K-Means Use Case; 10. K-Means Algorithm; 11. Hands-On; 12. Data Science Certification
  2. Need For Data Science
  3. Data Sources: Evolution of Technology (Telephone to Mobile, Desktop to Cloud, Car to Smart Car); IoT; Social Media; Other factors
  4. Data Sources: Evolution of Technology; IoT; Social Media; Other factors
  5. Data Sources: Evolution of Technology; IoT; Social Media; Other factors. Figures shown: 347,222 tweets; 1,736,111 pictures; 204,000,000 emails; 300 hours of video uploaded; 4,166,667 likes & 200,000 photos
  6. Data Sources: Evolution of Technology; IoT; Social Media; Other factors
  7. Walmart Use Case
  8. Data Analysis At Walmart, Halloween and cookie sales: Data scientists at Walmart found a connection between Halloween and the sales of cookies.
  9. Data Analysis At Walmart, Hurricane and strawberry Pop-Tarts: Data scientists at Walmart found that sales of strawberry Pop-Tarts increased sevenfold before a hurricane.
  10. Data Analysis At Walmart, Social media and cake pops: Walmart uses social media data to find trending products so that they can be introduced in Walmart stores across the world.
  11. What Is Data Science?
  12. What is Data Science? Data Science is the process of extracting knowledge and insights from data by using scientific methods. Scientific methods: Programming + Statistics + Business. “Torture the data, and it will confess to anything.” ~ Ronald Coase, Nobel Prize in Economics
  13. Who Is A Data Scientist?
  14. Who Is A Data Scientist? Mathematics; Business; Technology
  15. Data Science – Skill Set: Programming languages; Statistics; Machine Learning; Big Data processing frameworks; Data wrangling & exploration; Data visualisation; Data extraction & processing
  16. Data Science Job Roles
  17. Data Science Job Roles: Data Scientist; Data Analyst; Data Architect; Data Engineer; Statistician; Database Administrator; Business Analyst; Data & Analytics Manager
  18. Data Science Life Cycle
  19. Data Life Cycle (Data Science): Business requirements; Data acquisition; Data processing; Data exploration; Modelling; Deployment
  20. Data Life Cycle, Business requirements: Understand the problem; identify central objectives; identify variables that need to be predicted.
  21. Data Life Cycle, Data acquisition: What data do I need for my project? What are the data sources? How can I obtain the data? What is the most efficient way to store and access all of it?
  22. Data Life Cycle, Data processing: Transform data into the desired format. Data cleaning: missing values; corrupted data; remove unnecessary data.
  23. Data Life Cycle, Data exploration: Understand the patterns in the data; retrieve useful insights; form hypotheses.
  24. Data Life Cycle, Modelling: Determine optimal data features for the machine-learning model; create a model that predicts the target most accurately; evaluate & test the efficiency of the model.
  25. Data Life Cycle, Deployment: Check the deployment environment for dependency issues; deploy the model in a pre-production/test environment; monitor the performance.
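
The data-processing step on slide 22 (transform, handle missing and corrupted values, remove unnecessary data) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the records, the field names, and the choice to fill bad ages with the mean of the valid ones are all hypothetical and do not come from the slides.

```python
import math

# Hypothetical raw records; the field names are assumptions for illustration.
raw_rows = [
    {"id": 1, "age": "34", "city": "Austin", "debug_note": "ok"},
    {"id": 2, "age": "", "city": "Boston", "debug_note": "ok"},      # missing value
    {"id": 3, "age": "abc", "city": "Denver", "debug_note": "??"},   # corrupted value
    {"id": 4, "age": "51", "city": "Chicago", "debug_note": "ok"},
]


def parse_age(text):
    """Return the age as a float, or None if it is missing or not a valid number."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return value if math.isfinite(value) and value >= 0 else None


def clean(rows, keep=("id", "age", "city")):
    # Transform into the desired format: numeric ages, only the needed columns.
    parsed = [{**{k: r[k] for k in keep}, "age": parse_age(r["age"])} for r in rows]
    # Assumed policy: fill missing or corrupted ages with the mean of the valid ones.
    valid = [r["age"] for r in parsed if r["age"] is not None]
    fill = sum(valid) / len(valid) if valid else None
    for r in parsed:
        if r["age"] is None:
            r["age"] = fill
    return parsed


cleaned = clean(raw_rows)
for row in cleaned:
    print(row)
```

Mean imputation is one of several reasonable policies; dropping the affected rows or using the median would fit the same slide equally well.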
  26. Introduction To Machine Learning
  27. What Is Machine Learning? Machine learning is a subset of artificial intelligence (AI) that gives machines the ability to learn automatically and improve from experience without being explicitly programmed. (Figure labels: “They look the same!”; Cherry; Apple; Orange; Data; Algorithm)
  28. Types Of Machine Learning: Supervised Learning; Unsupervised Learning; Reinforcement Learning
  29. K-Means Use Case
  30. Brain Tumour Detection Using K-Means: Brain tumour segmentation applies the k-means algorithm to detect the range and shape of a tumour in brain MR images. K-means clustering is an unsupervised learning algorithm that partitions a dataset into k clusters, in which each data point belongs to the cluster with the nearest mean.
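
The definition on slide 30 (each data point belongs to the cluster with the nearest mean) corresponds to the standard k-means objective, the within-cluster sum of squared distances. The symbols below ($x$ for a data point, $C_i$ for the $i$-th cluster, $\mu_i$ for its mean) are notation introduced here, not taken from the slides:

```latex
\min_{C_1,\dots,C_k} \; \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - \mu_i \rVert^2,
\qquad
\mu_i = \frac{1}{|C_i|} \sum_{x \in C_i} x
```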
  31. K-Means Algorithm
  32. K-Means Algorithm (steps: Initialization; Cluster assignment; Move centroid; Optimization; Convergence), Initialization: Randomly initialize k points, called the cluster centroids. Here, k = 2. The value of k (the number of clusters) can be determined by the elbow curve.
  33. K-Means Algorithm, Cluster assignment: Compute the distance between each data point and the initialized cluster centroids. Each data point is assigned to the centroid at minimum distance, dividing the points into two groups. (Figure labels: centroids 1 and 2; Euclidean distance; Cluster centroid)
  34. K-Means Algorithm, Move centroid: Compute the mean of the red dots and reposition the red cluster centroid to this mean. Compute the mean of the green dots and reposition the green cluster centroid to this mean.
  35-38. K-Means Algorithm, Optimization: Repeat the previous two steps iteratively until the cluster centroids stop changing their positions.
  39. K-Means Algorithm, Convergence: Finally, the k-means clustering algorithm converges. It divides the data points into two clusters, clearly visible in red and green.
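
The five steps walked through on slides 32 to 39 can be sketched in plain Python as follows. This is a minimal illustration, not the presenter's hands-on code: the function name `kmeans`, the sample points, the `max_iter` cap, and the choice to draw the initial centroids from the data points themselves are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialization: pick k distinct data points as the starting centroids.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iter):  # max_iter is only a safety cap
        # Cluster assignment: each point joins its nearest centroid (Euclidean).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Move centroid: reposition each centroid at the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # empty cluster keeps its centroid
        # Convergence: stop once the centroids no longer change position.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Within-cluster sum of squared distances, the quantity an elbow curve plots.
    wcss = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return centroids, labels, wcss


# Illustrative data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels, wcss = kmeans(points, k=2)
print(centroids, labels, wcss)
```

Running `kmeans` for k = 1, 2, 3, ... and plotting `wcss` against k gives the elbow curve that slide 32 mentions for choosing k.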
  40. K-Means Algorithm: Data matrix; Distance/dissimilarity matrix
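
Slide 40 names two structures without showing them: a data matrix (one row per object, one column per feature) and a distance/dissimilarity matrix (one entry per pair of objects). A small sketch, assuming Euclidean distance as on slide 33 and using made-up values:

```python
import math

# Data matrix: n rows (objects) by p columns (features). Values are illustrative.
data = [
    (0.0, 0.0),
    (3.0, 4.0),
    (6.0, 8.0),
]


def distance_matrix(rows):
    """Return the n-by-n matrix of pairwise Euclidean distances."""
    return [[math.dist(a, b) for b in rows] for a in rows]


dist = distance_matrix(data)
for row in dist:
    print(row)
```

The result is symmetric with zeros on the diagonal, since the distance from an object to itself is zero and the distance between two objects does not depend on their order.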
  41. Hands-On
  42. Data Science Certification
  43. Edureka’s Data Science Certification
  44. Edureka’s Data Science Certification: Introduction to Data Science; Statistical Inference; Data extraction, wrangling & exploration; Introduction to Machine Learning; Classification techniques; Unsupervised Learning; Recommender engine; Text Mining; Time series; Deep Learning
  45. WebDriver vs. IDE vs. RC: A data warehouse is like a relational database designed for analytical needs. It functions on the basis of OLAP (Online Analytical Processing). It is a central location where consolidated data from multiple locations (databases) is stored.
