
Data science vs. Data scientist by Jothi Periasamy


  1. Data Science vs. Data Scientist: A Readiness and Assessment. Create talent and transform Malaysia into a digital economy. Jothi Periasamy
  2. Contents
     - Objective
     - Data science - defined
     - Data science vs. enterprise data science
     - Data scientist competency
     - Data scientist competency development approach
     Appendix
     - THE institute of enterprise analytics (TIOEA)
     - Our enterprise data science learning
     - Data scientist competency development approach
  3. Data Science vs. Data Scientist: Objective
     - Review the key functions of data science and how data science differs from traditional business intelligence (BI).
     - Understand the key competency areas, skills, roles, responsibilities, and deliverables of a data scientist.
     Diagram labels: Strategy, Implementation, Executive, Developers.
  4. What is data science?
     Data science is not new. It modernizes existing reporting, analytics, data warehousing, business intelligence, and even data management solutions. So data science is new thinking, new thoughts, new ideas, new data sources, new data formats and structures, new data architecture, new data processing mechanisms, new innovation on data, and new ways of solving problems. That's all.
     Traditional approach to data & analytics:
     - Data source & format: ERP, CRP, Oracle, SAP, MS SQL, etc.; tables; files
     - Data structure: structured data; ER model (entity relationship); MDM model (multi-dimensional)
     - Data access: SQL; filters & aggregate functions; business rules and formulas, etc.
     - Analytics: reports; dashboards; data analysis
     Analytics transformation (modern approach to data & analytics = data science):
     - Data source & format: ERP, CRP, Oracle, SAP, MS SQL, etc.; social (web, LinkedIn, Twitter, FB); streaming
     - Data structure: structured data; unstructured & semi-structured data; machine data
     - Data access: parallel processing; distributed computing; in-memory analytics
     - Analytics: predictive analytics (linear regression); data mining; clustering, segmentation, etc.
     What is new: data source & format, data architecture, analytics architecture, analytics techniques.
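The contrast on this slide between traditional data access (aggregate functions) and modern analytics (predictive analytics with linear regression) can be sketched in a few lines. This is a minimal illustration, not the author's method: the monthly sales figures are made up, and the function names `total_and_average`, `fit_line`, and `predict` are hypothetical.

```python
from statistics import mean

# Hypothetical monthly sales figures, made up for illustration only.
months = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]


def total_and_average(values):
    """Traditional BI-style aggregate: summarizes what already happened."""
    return sum(values), mean(values)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Evaluate the fitted line at x to estimate a value not yet observed."""
    slope, intercept = model
    return slope * x + intercept


total, average = total_and_average(sales)   # descriptive report
model = fit_line(months, sales)             # predictive model
forecast_month_7 = predict(model, 7)        # estimate for the next month
print(total, average, forecast_month_7)
```

The aggregate answers "what were sales so far?", while the fitted line gives an estimate for a month that has not happened yet, which is the shift the slide describes.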
  5. What is data science? (continued)
     - Like computer science, social science, life science, and other sciences, data science is a science: it extracts hidden value from any data by applying scientific, statistical, mathematical, and computing techniques.
     - Data science draws on all of these sciences together, since data is everywhere.
     Diagram: computer science, social science, life science, medical science, material science, and other fields each produce data (social data, medical data, pharmacy data, …); data science applies models and algorithms to that data to extract measurable hidden values.
  6. What is data science? (continued)
     - Data science offers a powerful new approach to making data discoveries by combining aspects of statistics, computer science, applied mathematics, and visualization.
     - Data science can turn the vast amounts of data the digital age generates into new insights and new knowledge.
     Diagram:
     - Data: structured, unstructured & semi-structured, and machine data (financial & billing, customer behavior, cell phone call records); big data
     - Apply scientific, statistical, and mathematical techniques: linear regression, time series & neural network, clustering, … much more
     - Results: predictive analytics, advanced analytics, data discovery, … much more
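Clustering, one of the techniques named on this slide, can be illustrated on call-record style data. The following is a rough sketch under stated assumptions: the monthly call minutes are invented, the starting centroids are chosen by hand, and `kmeans_1d` is a hypothetical helper implementing plain k-means (Lloyd's algorithm) in one dimension, not a technique the slides specify.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on one-dimensional data with fixed starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly call minutes per customer, made up for illustration.
call_minutes = [20, 25, 30, 200, 210, 220]
centroids, segments = kmeans_1d(call_minutes, centroids=[0, 100])
print(centroids)
print(segments)
```

With this toy data the customers separate into a light-usage and a heavy-usage segment, the kind of customer-behavior grouping the slide lists under data discovery.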
  7. Data Science Project Scope
     Based on the projects I have been involved in, the scope and focus of a data scientist role differ, so it is critical to understand the different focus areas and deliverables of a data science project.
     Research & Development (data scientist, typical data science scope). Deliverables:
     - Developing new models
     - Developing new algorithms
     - New analytics techniques & innovation
     - New data product or platform development
     - New analytics product or platform development
     - etc.
     Enterprise & Industry (enterprise data scientist, typical data science project scope). Deliverables:
     - Solving business problems
     - Target marketing & reduced marketing spend
     - Consistent customer experience across all channels; personalized customer experience
     - etc.
     Modernizing existing business intelligence solutions & data solutions.
  8. Enterprise Data Science Framework
     On an enterprise data science project, an enterprise data scientist is expected to know the industry and its associated business processes very well in order to lead, guide, and deliver the project. The following are the core enterprise data science building blocks:
     - Industry: oil & gas, media, telecommunication, power & utility, retail, etc.
     - Business process: finance, customer, marketing, human resource, supply chain, etc.
     - Data: unstructured data, semi-structured data, structured data, machine data
     - Analytics / data science techniques: linear regression, time series, clustering, neural network, association, etc.
     - Measurable business values: cost reduced by 2%, revenue increased by 5%, etc.
  9. Data Scientist Key Competency Area
     - Many different skills are required to become a data scientist; these are our key observations on the skills required to deliver a data science project.
     - Please note that we did not list specific skills under each area. For example, under Data we would have data management, data governance, data quality, data modeling, data architecture, data integration, data mapping, etc.
     Data science project scope, Research & Development:
     - Less focus on industry skill
     - Less focus on business process skill
     - Deeper focus on data skills
     - Deeper focus on analytical skills
     - Less focus on communication and people skills
     - Deeper technology skills
     Data science project scope, Enterprise & Industry:
     - Deep focus on industry skill
     - Deep focus on business process skill
     - Data skills
     - Analytical skills
     - Strong communication and people skills
     - Technical skills
     Scale: entry level to senior level, basic skill to deeper skill.
     Profiles shown: Enterprise Analytics Transformation Leader, Industry Expert; Technology Thought Leader, PhDs, Academia.
  10. Data Scientist Competency Development Approach: who may be the best fit for a data scientist?
     I found that upskilling industry professionals who have prior experience in BI, data, and data warehousing is a faster, more stable, and more sustainable approach to delivering and supporting an enterprise data science project.
     - Upskill on industry and business process
     - Upskill on advanced analytics and data science techniques
     Data science project scope, Research & Development: less focus on industry, business process, and communication and people skills; deeper focus on data, analytical, and technology skills.
     Data science project scope, Enterprise & Industry: deep focus on industry and business process skills; data, analytical, and technical skills; strong communication and people skills.
     Scale: entry level to senior level, basic skill to deeper skill.
  11. Key Takeaways: Enterprise Analytics Transformation Initiative or Enterprise Data Science Project
     - Based on our industry experience, the following are some of the key characteristics of a data scientist on an enterprise analytics transformation initiative.
     - Key roles, responsibilities, and deliverables of a data scientist on an enterprise data science project.
     Data scientist key roles & responsibilities:
     - Visionary
     - Domain expert
     - Innovator
     - Transformation leader
     - Change agent
     - Data expert
     - Analytical thinker
     - Technology thought leader
     Data scientist key deliverables:
     - Business case
     - Strategy and roadmap
     - Standards, policies, and guidelines
     - Data management framework
     - Modern enterprise data architecture: big data lake
     - Modern enterprise analytics architecture: enterprise data science
     - Plan of action: tactical-level execution plan
     - User adoption
     - Tools, templates, and accelerators
  12. Key Takeaways: Enterprise Analytics Transformation Initiative or Enterprise Data Science Project
     - Other roles and responsibilities that may be involved in an enterprise analytics transformation initiative or enterprise data science project.
     - Please note that these roles are not mandatory. They may or may not exist and are subject to change, depending on the project scope and objectives.
     Other roles & responsibilities:
     - Data engineer
     - Data architect
     - Data modeler
     - ETL developer
     - Information modeler
     - Information security expert
     - Data analyst
     - Data visualization engineer
     - etc.
     Other deliverables:
     - Data provisioning functional and technical components
     - Data modeling
     - Information modeling
     - Data visualization components
     - etc.
  13. Appendix
  14. THE institute of enterprise analytics (TIOEA)
     - Create talent and jobs
     - Simplify data science learning and empower learners with industry use cases and pre-packaged business content
     - Be a thought leader and governance model for enterprise data science implementation
     - Accelerate enterprise data science implementation with a proven innovation lab, tools, templates, standards, policies, and guidelines
     TIOEA focus areas: industry use case; research and innovation; consulting and implementation; training & talent development; thought leadership; tools and templates.
  15. Role-based Learning (TIOEA, draft)
     We provide practical coaching and an on-the-job learning experience.
     Audiences: freshers, experienced professionals, executives.
     - Enterprise Data Science for Executives
     - Enterprise Big Data for Executives
     - Enterprise HADOOP for Executives
     - Enterprise Data Science for Architects
     - Enterprise Big Data for Architects
     - Enterprise HADOOP for Architects
     - Enterprise Data Science for Developers
     - Enterprise Big Data for Developers
     - Enterprise HADOOP for Developers
     Learn to build, learn to deliver, learn to lead.
     CAP (Certified Analytics Professional)
  16. Learning Roadmap: Functional Learning (draft)
     - Industry use case: problem statement; business needs & challenges; business impacts and benefits
     - Strategy & roadmap: implementation methodology; implementation options & plan; deliverables and milestones
     - Data: data governance; data management; data sources & data format; data modeling; data integration; design and leading practices
     - Analytics: data science overview; data science vs. enterprise data science; predictive analytics & advanced analytics; traditional "BI" vs. data science; analytics techniques; design and leading practices
     - User experience: data visualization; self-service and data analysis; reporting, insights, and improved decision making; deployment on desktop, mobile, and cloud; design and leading practices
     Our Enterprise Data Science Lab (HADOOP + SAP HANA + Oracle 12C + analytics tools + open source technologies)
  17. Learning Roadmap: Technical Learning (draft)
     Tracks: industry use case; "R" programming; Python programming; machine learning; enterprise HADOOP.
     Topics:
     - Problem statement; business needs & challenges; business impacts and benefits
     - Implementation methodology; implementation options & plan; deliverables and milestones
     - HADOOP overview; HADOOP architecture; HADOOP core components; data management on HADOOP; analytics & application on HADOOP; HADOOP ecosystem and total cost of ownership
     - Enterprise data science overview; data science vs. enterprise data science; predictive analytics & advanced analytics; analytics on "R" overview; analytics on "Python" overview; analytics on "Natural Language Processing"
     - Data visualization; traditional "BI" vs. data science; self-service and data analysis; insights and improved decision making; deployment on desktop, mobile, and cloud; change management and training
     - Big data enabling technologies; cloud technologies overview; SAP analytics tools overview; Oracle analytics tools overview; open source technologies overview; data management technologies overview; data management implementation overview
     Our Enterprise Data Science Lab (HADOOP + SAP HANA + Oracle 12C + analytics tools + open source technologies)
  18. Thank you!
