Introduction to Data Science

In this presentation, I briefly discuss Big Data and its importance. I cover the basics of Data Science and its present-day importance through a case study. You can also get an idea of who a data scientist is and what tasks they perform. A few applications of data science are illustrated at the end.

  1. Introduction to Data Science. Presented by: Srishti
  2. How Much Data Does The World Generate Every Minute? “Over 2.5 quintillion bytes of data are created every single day, and it’s only going to grow from there. By 2020, it’s estimated that 1.7MB of data will be created every second for every person on earth.” (DOMO, Data Never Sleeps 6.0)
  3. Big Data: Big data is a term that describes the large volume of data, both structured and unstructured, that inundates a business on a day-to-day basis.
  4. Why is Big Data important? The importance of big data doesn’t revolve around how much data you have, but what you do with it. Big data, and the way organizations manage and derive insight from it, is changing the way the world uses business information.
  5. Who uses Big Data? Big data affects organizations across practically every industry: banking, education, healthcare, government, manufacturing, and retail.
  6. What is Data Science? Data science is a multidisciplinary blend of data inference, algorithm development, and technology, used to solve analytically complex problems.
  7. Fields of Data Science: Data science covers a wide spectrum of domains. It uses various AI, machine learning, and deep learning methodologies to analyse data and extract useful insights from it.
  8. Lifecycle of Data Science
  9. Case Study: Diabetes Prevention. What if we could predict the occurrence of diabetes and take appropriate measures beforehand to prevent it?
  10. Step-1 Discovery: First, we will collect the data based on the medical history of the patient. Attributes: npreg (number of times pregnant); glucose (plasma glucose concentration); bp (blood pressure); skin (triceps skinfold thickness); bmi (body mass index); ped (diabetes pedigree function); age (age); income (income). [Sample data table]
  11. Step-2 Data Preparation: Once we have the data, we need to clean and prepare it for analysis. This data has a lot of inconsistencies, such as missing values, blank columns, abrupt values, and incorrect data formats, which need to be cleaned.
  12. So, we will clean and preprocess this data by removing the outliers, filling in the null values, and normalizing the data types (data preprocessing). Finally, we get clean data that can be used for analysis.
  13. Step-3 Model Planning: Now let’s do the analysis discussed earlier for Phase 3. First, we will load the data into the analytical sandbox and apply various statistical functions to it. Then, we use visualization techniques such as histograms, line graphs, and box plots to get a fair idea of the distribution of the data.
  14. Step-4 Model Building: Based on the insights derived from the previous step, the best fit for this kind of problem is the decision tree. Decision tree models are very robust, as we can use different combinations of attributes to make various trees and then implement the one with the maximum efficiency.
  15. Step-5 Operationalize: In this phase, we will run a small pilot project to check whether our results are appropriate. We will also look for performance constraints, if any. If the results are not accurate, we need to replan and rebuild the model. Step-6 Communicate Results: Once we have executed the project successfully, we will share the output for full deployment.
  16. How to choose algorithms?
  17. Who is a Data Scientist? Data scientists are a new breed of analytical data experts who have the technical skills to solve complex problems, and the curiosity to explore what problems need to be solved.
  18. Where did they come from? They’re part mathematician, part computer scientist, and part trend-spotter. Many data scientists began their careers as statisticians or data analysts.
  19. Data Scientist: Master of all trades!
  20. Tasks of a Data Scientist: Collecting large amounts of data and analyzing it. Using data-driven techniques for solving business problems. Communicating the results to business and IT leaders. Spotting trends, patterns, and relationships within data. Converting data into compelling visualizations. Working with Artificial Intelligence and Machine Learning techniques. Deploying text analytics and data preparation.
  21. How to become a Data Scientist: Roadmap
  22. Data Science: Applications
  23. Why Is Data Science In Demand? As companies realize the value and power of Big Data, they strive to use it to make better business decisions. Reasons: companies are facing challenges with the handling of data; shortage of skilled resources; hard to find multi-factors; entry barriers for other professionals; the pay is great; a plethora of roles.
  24. The demand for data scientists will rise by 28% by 2020 alone. More and more industries are becoming data hungry, and they need specialized data scientists who can craft products for their customers. About 11.5 million jobs will be created by 2026, according to the U.S. Bureau of Labor Statistics.
  25. 25. References www.wikipedia.com https://www.zarantech.com/blog/top-10-applications- of-data-science/ https://towardsdatascience.com/big-data-and-data- science/ https://data-flair.training/blogs/data-science-case- studies/ https://www.edureka.co/blog/what-is-data-science/
  26. Thank you!
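
The data preparation and model building steps of the diabetes case study (slides 11 to 14) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the presenter's actual code: the records are made-up toy values, the outlier rule (median absolute deviation) and the median fill are one possible choice among many, and the model is a one-level decision tree (a "decision stump") rather than a full tree. All function names are hypothetical.

```python
from statistics import median

# Hypothetical toy records, NOT taken from the slides. Values arrive as strings,
# "" marks a missing value, and "900" is a deliberately abrupt glucose reading.
RAW = [
    {"glucose": "85", "bmi": "22.0", "diabetic": 0},
    {"glucose": "90", "bmi": "", "diabetic": 0},
    {"glucose": "100", "bmi": "25.0", "diabetic": 0},
    {"glucose": "150", "bmi": "33.0", "diabetic": 1},
    {"glucose": "160", "bmi": "35.0", "diabetic": 1},
    {"glucose": "", "bmi": "31.0", "diabetic": 1},
    {"glucose": "900", "bmi": "24.0", "diabetic": 0},
]
FEATURES = ["glucose", "bmi"]
TARGET = "diabetic"


def to_float(text):
    """Normalize the data type: numeric string -> float, blank -> None."""
    text = text.strip()
    return float(text) if text else None


def fill_missing(rows):
    """Fill null values with the median of the observed values in each column."""
    for f in FEATURES:
        med = median(r[f] for r in rows if r[f] is not None)
        for r in rows:
            if r[f] is None:
                r[f] = med
    return rows


def drop_outliers(rows, k=3.0):
    """Drop rows lying more than k median absolute deviations from the median."""
    bounds = {}
    for f in FEATURES:
        values = [r[f] for r in rows]
        med = median(values)
        mad = median(abs(v - med) for v in values)
        # If the spread is zero the rule is meaningless, so keep every row.
        spread = k * mad if mad > 0 else float("inf")
        bounds[f] = (med - spread, med + spread)
    return [r for r in rows
            if all(bounds[f][0] <= r[f] <= bounds[f][1] for f in FEATURES)]


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def fit_stump(rows):
    """Pick the single feature/threshold split with the lowest weighted Gini."""
    best = None
    for f in FEATURES:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between neighbouring values
            left = [r[TARGET] for r in rows if r[f] <= t]
            right = [r[TARGET] for r in rows if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best["score"]:
                best = {"feature": f, "threshold": t, "score": score,
                        # Majority class on each side of the split.
                        "left": int(2 * sum(left) >= len(left)),
                        "right": int(2 * sum(right) >= len(right))}
    return best


def predict(stump, row):
    """Predict the 0/1 label for one cleaned row."""
    side = "left" if row[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]


rows = [{**r, **{f: to_float(r[f]) for f in FEATURES}} for r in RAW]
clean = drop_outliers(fill_missing(rows))
stump = fit_stump(clean)
print(len(clean), "clean rows; split on", stump["feature"], "at", stump["threshold"])
```

In practice these steps would usually be done with a data-frame library and a full decision tree implementation; the standard-library version here only makes each step of the lifecycle visible.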
