Data analytics career path


  1. 1. Data/Analytics Career Paths Eng. Ahmed Amr ahmed.amr@rubikal.com
  2. 2. Road Map
      ● Defining Data Science.
      ● Data Science Marketplace.
      ● Required Skills for Data Science.
      ● Data Science Career Paths.
      ● A Day in the Life of a Data Scientist.
  3. 3. Data Science, hype/reality?
      “Data Scientist: The Sexiest Job of the 21st Century” – Thomas H. Davenport and D.J. Patil
      “Analytics is defined as the scientific process of transforming data into insight for making better decisions.” – The Institute for Operations Research and the Management Sciences (INFORMS)
      “With more and more companies using big data, the demand for data analytic specialists, sometimes called data scientists, who know how to manage the tsunami of information, spot patterns within it and draw conclusions and insights, is nearing a frenzy.” – Chris Morris, CNBC
  4. 4. Data Scientist
      ● “A person who is better at statistics than any software engineer and better at software engineering than any statistician.” - Josh Wills, Director of Data Science at Cloudera
      ● “Data scientists are inquisitive: exploring, asking questions, doing ‘what if’ analysis, questioning existing assumptions and processes. Armed with data and analytical results, a top-tier data scientist will then communicate informed conclusions and recommendations across an organization’s leadership structure.” - Anjul Bhambhri, IBM
  5. 5. Defining Data Science: History
  6. 6. Role of Computer Science
      Empowering statistics: solving a wide range of practical problems by providing number crunching and massive storage.
  7. 7. Inventions
      Accelerating the pace of the marriage between Statistics and Data Science.
      1960s: Database Management Systems (DBMS).
      1970s: Relational DBMS.
  8. 8. Knowledge Discovery and Data Mining
      Late 1980s: Terms like "knowledge discovery" and "data mining" started to be widely used.
  9. 9. Big Data
      Early 1990s: Explosion of business data.
      1997: Official start of the term "big data".
  10. 10. Data Science
      Late 1990s: The phrase "data science" first appeared, to inspire professionals to harness the power of data by analyzing it effectively and producing useful intelligence. "Statistician" is replaced by "data scientist".
  11. 11. Analytics
      Mid 2000s: The word "analytics" was adopted by data scientists to emphasize that an increasing number of companies had started to rely heavily on statistical and quantitative analysis of data, as well as predictive modeling, to make informed decisions and compete better with other businesses.
  12. 12. Defining Data Science: Enabling Technologies
  13. 13. 1-Data Infrastructure Technologies
      ● Support how data is:
        1) Shared.
        2) Processed.
        3) Consumed.
      ● Distributed Computing and Cloud Computing.
        ○ Virtualization and distributed file sharing.
  14. 14. Distributed Computing
      ● An approach that breaks a task down into smaller pieces that are easier to process.
      ● Each piece of the task is assigned to a processor, and the processors may be geographically dispersed.
      ● Software is necessary to manage all aspects of distributed computing.
        ○ e.g. Hadoop
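To make the idea concrete, here is a minimal single-machine sketch in Python of the split, process, and merge pattern described on this slide. It is not Hadoop: a thread pool stands in for the geographically dispersed processors, and the function names (`count_words`, `split_into_chunks`, `distributed_word_count`) and the sample lines are assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Process one piece of the task: count the words in a chunk of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def split_into_chunks(items, n_chunks):
    """Break the task into roughly equal pieces."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def distributed_word_count(lines, n_workers=3):
    """Assign each chunk to a worker, then merge the partial results."""
    chunks = split_into_chunks(lines, n_workers)
    # Threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partial_counts:  # merge step
        total.update(partial)
    return total


# Made-up sample input.
LINES = [
    "big data needs big storage",
    "data science needs data",
    "storage is cheap",
    "science is fun",
]
RESULT = distributed_word_count(LINES)
```

The merged result is the same as counting everything in one pass. In a real cluster, the management software also handles scheduling, data placement, and worker failures.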
  15. 15. Cloud Computing
      ● A platform to support distributed computing.
      ● A collection of computers housed in data centers.
      ● Can serve as readily available hardware for distributed computing.
  16. 16. 2-Data Management Technologies
      ● Data management is handled by a DBMS.
      ● Data science requires highly scalable, reliable, and efficient ways to store, manage, and process data.
      ● Structured and unstructured data.
  17. 17. 3-Visualization Technologies
      ● Acquired insights need to be conveyed to the leadership of an organization.
      ● Effective communication with non-experts.
      ● Responsible for increasing the impact of data science project results.
  18. 18. Data Science Marketplace
  19. 19. Fraud Detection
      ● Criminals are committing fraud against the banking sector.
      ● In the past:
        ○ Significant human intervention.
        ○ Desired outcome: improved accuracy.
      ● Today:
        ○ Machine learning and big data analytics.
  20. 20. Social Media Analytics
      ● Huge amount of data, millions of posts.
      ● Metadata is valuable.
        ○ Data about data, such as location information and timestamps.
      ● IBM Personality Insights product.
        ○ Gives companies a deeper understanding of their customers' personalities.
  21. 21. Data Science Skills
  22. 22. Data Mining and Analytics Skill 1. Classification:
      ● Constructs a model from data with known labels.
      ● Data is represented as discrete sets (classes).
      ● Can categorize trustworthy and untrustworthy users of an online banking system.
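As an illustration of classification, the following Python sketch trains a nearest-centroid classifier, one of the simplest ways to build a model from labeled examples. The slide does not name a specific algorithm, so this choice is an assumption, as are the two features (failed logins per week, new devices per month) and the toy data.

```python
import math


def centroid(points):
    """Average position of a group of feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def fit(samples, labels):
    """Learn one centroid per known label."""
    grouped = {}
    for features, label in zip(samples, labels):
        grouped.setdefault(label, []).append(features)
    return {label: centroid(points) for label, points in grouped.items()}


def predict(model, features):
    """Assign the label whose centroid is closest to the new sample."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical features: (failed logins per week, new devices per month).
SAMPLES = [(0, 1), (1, 0), (0, 0), (8, 5), (9, 6), (7, 4)]
LABELS = ["trustworthy"] * 3 + ["not trustworthy"] * 3

MODEL = fit(SAMPLES, LABELS)
NEW_USER_LABEL = predict(MODEL, (1, 1))
RISKY_USER_LABEL = predict(MODEL, (6, 5))
```

The output is always one of the discrete labels seen during training, which is what distinguishes classification from the prediction of continuous values.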
  23. 23. Data Mining and Analytics Skill 2. Prediction:
      ● Builds a model that predicts continuous or ordered values.
      ● These models can predict, for example, the mean time to failure for computers.
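A simple way to predict a continuous value is ordinary least-squares linear regression. The sketch below fits a straight line in plain Python. The choice of linear regression, the operating-temperature feature, and the numbers are all assumptions for illustration, not figures from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict a continuous value for a new input."""
    return slope * x + intercept


# Made-up data: operating temperature (deg C) vs. mean time to failure (hours).
TEMPERATURES_C = [30, 40, 50, 60]
MTTF_HOURS = [1000, 900, 800, 700]

SLOPE, INTERCEPT = fit_line(TEMPERATURES_C, MTTF_HOURS)
PREDICTED_MTTF = predict(SLOPE, INTERCEPT, 55)
```

Unlike a classifier, the model returns a number on a continuous scale, so it can produce values that never appeared in the training data.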
  24. 24. Data Mining and Analytics Skill 3. Clustering:
      ● Is a process of grouping similar data objects into a class.
      ● Helps reveal features that distinguish one class of data objects from another, leading to new discoveries on a dataset.
      ● As an example, clustering can reveal people with similar purchasing behaviours.
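The following is a minimal k-means sketch in plain Python that groups customers by purchasing behaviour without any labels. K-means is one common clustering algorithm, chosen here as an assumption. The features (orders per month, average basket value) and the data are made up, and the initial centroids are picked at evenly spaced positions to keep the example deterministic, which assumes `k` is no larger than the number of points.

```python
import math


def mean_point(points):
    """Average position of a group of feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def k_means(points, k, max_iterations=100):
    """Group points into k clusters by repeatedly reassigning and re-averaging."""
    # Simplification: start from evenly spaced data points instead of random ones.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    assignments = None
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing moved, so the clusters are stable
        assignments = new_assignments
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = mean_point(members)
    return assignments, centroids


# Hypothetical features: (orders per month, average basket value).
CUSTOMERS = [(1, 10), (2, 12), (1, 11), (10, 95), (11, 100), (9, 90)]
CLUSTER_IDS, CENTROIDS = k_means(CUSTOMERS, k=2)
```

No labels are supplied, so the algorithm discovers the two groups on its own. Interpreting what each group means is left to the analyst.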
  25. 25. Machine Learning Skill
      ● Machine learning is based on self-learning or self-improving algorithms.
      ● In machine learning, a computer starts with a model and continues to enhance it through trial and error.
      ● It can then provide meaningful insight in the form of classification, prediction, and clustering.
  26. 26. Machine Learning Skill
      ● A data scientist needs to be familiar with models that are commonly used in data science, such as:
        ○ Logistic regression.
        ○ Support vector machines.
        ○ Bayesian methods.
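The sketch below shows, under simplifying assumptions, how a model can start from an uninformed state and improve step by step. It trains logistic regression with batch gradient descent in plain Python. The single standardized feature, the toy data, the learning rate, and the epoch count are all illustrative choices.

```python
import math


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def probability(weights, bias, x):
    """Predicted probability that sample x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def log_loss(weights, bias, xs, ys):
    """Average cross-entropy loss; lower means a better fit."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(probability(weights, bias, x), 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)


def train(xs, ys, learning_rate=0.1, epochs=500):
    """Start from all-zero parameters and improve them a little each epoch."""
    weights = [0.0] * len(xs[0])
    bias = 0.0
    history = []
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = probability(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
        history.append(log_loss(weights, bias, xs, ys))
    return weights, bias, history


# Made-up data: one standardized score per sample, with a 0/1 label.
XS = [(-2.0,), (-1.5,), (-1.0,), (1.0,), (1.5,), (2.0,)]
YS = [0, 0, 0, 1, 1, 1]

WEIGHTS, BIAS, HISTORY = train(XS, YS)
LOW_SCORE_CLASS = int(probability(WEIGHTS, BIAS, (-1.2,)) >= 0.5)
HIGH_SCORE_CLASS = int(probability(WEIGHTS, BIAS, (1.2,)) >= 0.5)
```

`HISTORY` records the loss after every epoch, so the improvement over time can be inspected directly. In practice a library such as scikit-learn would normally be used instead of hand-written training code.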
  27. 27. Statistics Skill
      ● Lays a foundation for data science.
      ● The more you know about it, the better.
      ● At minimum, you need to know:
        ○ Probability.
        ○ Correlations.
        ○ Variables, distributions, and regression.
        ○ Null hypothesis significance tests.
        ○ Confidence intervals, ANOVA, t-tests, and chi-square.
        ○ Tools like:
          ■ R, Excel.
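Two of the items listed above, correlation and the t-test, can be sketched with nothing more than Python's standard library. The function names and the sample numbers below are assumptions for illustration. The sketch computes only the Welch t statistic: turning it into a p-value needs the t distribution, which a statistics package such as R provides.

```python
import math
import statistics


def pearson_correlation(xs, ys):
    """Pearson's r: strength of the linear relationship between two variables."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross_products / (spread_x * spread_y)


def welch_t_statistic(sample_a, sample_b):
    """t statistic for the difference in means, allowing unequal variances."""
    var_a = statistics.variance(sample_a)  # sample variance (n - 1)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error


# Made-up measurements.
HOURS = [1, 2, 3, 4, 5]
OUTPUT = [2, 4, 6, 8, 10]
GROUP_A = [5.1, 4.9, 5.0, 5.2, 4.8]
GROUP_B = [5.6, 5.4, 5.5, 5.7, 5.3]

R = pearson_correlation(HOURS, OUTPUT)
T = welch_t_statistic(GROUP_A, GROUP_B)
```

A value of `R` near 1 or -1 indicates a strong linear relationship, and a `T` far from zero suggests the two group means differ by more than sampling noise alone would explain.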
  28. 28. Visualization Skill
      ● An important skill for overcoming the challenge of effectively communicating the results of data analytics to an audience.
      ● Tableau offers one of the most popular and comprehensive visualization tools for data scientists. It supports a variety of visualization elements, such as different types of charts, graphs, and maps.
  29. 29. Programming Skill
      ● Ability to code in at least one programming language, such as Python, Java, or Scala.
      ● Many languages have powerful libraries to clean and process your data (e.g. pandas).
      ● Along with powerful libraries to build machine learning models (e.g. scikit-learn).
  30. 30. Big-Data Analytics Skills
  31. 31. Data Team Skills Variety
  32. 32. Data Science Roles and Career Paths
  33. 33.
  34. 34.
  35. 35.
  36. 36.
  37. 37.
  38. 38.
  39. 39.
  40. 40.
  41. 41. A day in the life of a data scientist
  42. 42. Questions?
  43. 43. Thanks!