
Paper presentation

  1. The Ambiguity of Data Science Team Roles and the Need for a Data Science Workforce Framework
     Authors: Jeffrey S. Saltz, Nancy W. Grady
     Presented by: K.K. Tripathi (Course: CS795: Intro to Data Science)
     October 18th, 2018
  2. Introduction
     Aim
       • Enable organizations to staff their data science teams more accurately with the desired skillsets.
       • Provide job titles and job descriptions that more clearly identify tasks, knowledge, skills, and abilities, which can benefit the data science community.
       • Remove the overloading of the term "data scientist".
     Objective
       • To address this challenge, the paper frames and provides data science workforce definitions with examples.
  3. Background
     Issue
       • Generalization of the term "data science".
     Problems
       • It is difficult to ascertain what skills are needed to perform the specific tasks required to build and deploy big data analytics (BDA) systems.
       • This lack of vocabulary creates many issues (e.g., identifying the appropriate person to hire for a specific role within a data science team).
       • There is no agreed-upon process model for data science.
       • Skillsets overlap (software development lifecycles).
  4. Role-based model by NICE (US DoD CWF)
     Based on a list of task, knowledge, skill, and ability descriptions, a workforce framework maps them to work roles.
     Domain benefits:
       • Employers: track staff skills, training, and qualifications; improve position descriptions; develop career paths; analyze proficiency.
       • Educators: develop curriculum and conduct training for programs, courses, and seminars for specific roles.
       • Technology providers: identify the work roles, tasks, knowledge, skills, and abilities associated with products.
  5. Methodology (case studies)
     Goal: to explore the commonality and diversity of the vocabulary used to describe roles within data science teams.
     Qualitative case studies based on selected organizations:
       • Standards organizations: NIST, EDISON
       • Industry organizations: SAIC, Springboard
       • Advisory company: Gartner
  6. NIST
     Developed a big data reference architecture (RA) that categorizes the components of big data systems. The RA consists of 5 components and identifies their respective roles.
       • System Orchestrator: integrates the data application
       • Data Provider: introduces new data into the big data system (BDS)
       • Big Data Application Provider
       • Big Data Framework Provider
       • Data Consumer
       • Security and Privacy: interacts with the System Orchestrator
       • Management: big data life cycle management, e.g., package, software, and backup management
  7. EDISON
     A European Union-funded project to build the data science profession. The EDSF (EDISON Data Science Framework) comprises several documents, including the data science professional profiles and the Model Curriculum.
       • Data Scientist: merges, manages, and interprets large data sets
       • Data Science Researcher: applies the scientific discovery research process and hypothesis testing
       • Data Science Architect: creates relevant data models and process workflows
       • Data Science Programmer: designs, develops, and codes large data (science) analytics applications
       • Data/Business Analyst: extracts information about system, service, or organization performance
  8. SAIC
     A systems integrator that works primarily for the federal government, including civilian, defense, and intelligence customers.
       • Developed Data Science Edge (an internal process model)
       • Extends the CRISP-DM process to align with big data
     Roles:
       • Information Architect: develops data models for optimal performance in databases
       • Data Scientist: works in cross-functional teams at all stages of the analysis lifecycle; follows a scientific approach to generate value from data
       • Metrics and Data: develops, inspects, mines, transforms, and models data to improve productivity
       • Knowledge and Collaboration Engineer: designs and implements tools
       • Big Data Engineer: works with the full open source Hadoop stack, from cluster management to repository
  9. Springboard
     An online data science education startup. Defines the following roles:
       • Data Engineer: typically knows a variety of programming languages and focuses on coding and cleaning up data sets; takes the predictive model from the data scientist and implements it in code
       • Data Scientist: bridges the gap between programming and implementation of data science, the theory of data science, and the business implications of data
       • Data Analyst: provides visualizations and reports, and explains insights
       • Data Architect: focuses on structuring the technology that manages the data models
  10. Gartner
      A research and advisory firm that advises upper-level decision makers. Suggested roles:
       • Data Scientists: extract various types of knowledge from data; cover the end-to-end process
       • Data Engineers: make the data accessible and available for data scientists
       • Business Experts: business domain experts
       • Source System Experts: have knowledge of data at the business application level
       • Software Engineers: handle custom coding requirements
       • Quant Geeks: "nice-to-have" in certain situations, but "must-have" in rare situations
       • Unicorns: well-versed data scientists
  11. Discussion (integrated view of roles used)
      Search phrases used on Dice.com
  12. Data Scientist vs. Data Engineer
      Most frequent key phrases used in job descriptions:
  13. Future & Conclusion
       • Future: further changes are expected in cases such as the blending of data-intensive and compute-intensive applications, e.g., the rise of High Performance Data Analytics (HPDA).
       • Conclusion: the analysis of role usage in the industry needs to be rerun in the future (every 6 months) to identify trends over time.
  14. Thank you
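
Slides 11 and 12 refer to key phrases found in job descriptions. As a minimal sketch of how such a phrase count could be produced, the Python below counts how many postings mention each phrase. The phrase list, the sample postings, and the function name `count_phrases` are all hypothetical and chosen for illustration; they are not data or methods taken from the paper.

```python
from collections import Counter

# Hypothetical phrase list, for illustration only (not from the paper).
PHRASES = ["machine learning", "hadoop", "sql", "statistics"]


def count_phrases(postings, phrases):
    """Count how many postings mention each phrase.

    Matching is a case-insensitive substring test, so each posting
    contributes at most 1 to the count of any given phrase.
    """
    counts = Counter()
    for text in postings:
        lowered = text.lower()
        for phrase in phrases:
            if phrase in lowered:
                counts[phrase] += 1
    return counts


# Hypothetical job postings, for illustration only.
postings = [
    "Data Scientist: machine learning, statistics, SQL",
    "Data Engineer: Hadoop, SQL, ETL pipelines",
    "Data Engineer: Hadoop cluster management",
]

counts = count_phrases(postings, PHRASES)
for phrase, n in counts.most_common():
    print(phrase, n)
```

Rerunning the same count on a fresh set of postings at regular intervals is one simple way to track the trends over time that the conclusion calls for.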

Editor's Notes

  • NIST (National Institute of Standards and Technology) developed a cybersecurity workforce framework: NICE (National Initiative for Cybersecurity Education)
  • National Institute of Standards and Technology
  • The parent company of Science Applications International Corporation (SAIC) changed its name to Leidos
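
Slide 4 describes a workforce framework as a mapping from tasks, knowledge, skills, and abilities to work roles. The sketch below shows one possible way to represent that mapping in Python. The `WorkRole` class, the `roles_for_skill` helper, and the sample entries are assumptions made for illustration (the role names are borrowed from slide 9); this is not the NICE schema or the paper's own data model.

```python
from dataclasses import dataclass, field


@dataclass
class WorkRole:
    """A work role described by its tasks, knowledge, skills, and abilities.

    Hypothetical structure for illustration; not the NICE schema.
    """
    name: str
    tasks: set = field(default_factory=set)
    knowledge: set = field(default_factory=set)
    skills: set = field(default_factory=set)
    abilities: set = field(default_factory=set)


def roles_for_skill(roles, skill):
    """Return the sorted names of all roles that list the given skill."""
    return sorted(role.name for role in roles if skill in role.skills)


# Sample entries are illustrative assumptions, not taken from any framework.
roles = [
    WorkRole(
        "Data Engineer",
        tasks={"clean data sets", "implement predictive models in code"},
        skills={"programming"},
    ),
    WorkRole(
        "Data Analyst",
        tasks={"provide visualizations and reports"},
        skills={"visualization"},
    ),
    WorkRole(
        "Data Scientist",
        tasks={"build predictive models"},
        skills={"programming", "statistics"},
    ),
]

print(roles_for_skill(roles, "programming"))
```

With a structure like this, an employer could look up which roles require a given skill, and an educator could list the tasks a course for a specific role should cover.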
