2. Content
1. FAQ - Summary of the course
2. Understanding the Role: Data Science Career Overview
3. Demand, supply and the Job Market
4. Salaries in Data Science
5. Typical Data Science Course
6. Excelling as Data Scientist
3. Data Science Career Overview
● Why choose a career in Data Science?
● Data Science is Interdisciplinary
● Data Science Positions and Titles
● Thoughts on Higher Education
4. Intro to Data Science - What is Data Science
What is Data Science?
Why do we need Data Science?
What does a Data Scientist do?
What does a Data Scientist's typical day look like?
Data Science Roles
Salary Range
How does a typical Data Science project work?
Future of Data Science
Will Data Science be in demand?
5. Prerequisites
How much Math / Statistics is required?
How many Machine Learning algorithms should I know?
Importance of having a Master's or PhD
What are the minimum requirements to start a career in Data Science?
What background do you need to start a career in Data Science?
What is the most important habit for becoming a good Data Scientist?
What are the best 3 qualities to have in Data Science?
6. Pathways to study Data Science
What are the main skills to learn?
Statistics topics to learn
Which programming languages should you know?
R vs. Python vs. Scala
Do I need to know both R and Python, or will one do?
Recommended Books
Resources to learn
How to become a Top Level Data Scientist
Where to get practical experience
7. Portfolio & Resume Prep
Should I create a blog or portfolio in order to get a DS job?
Best places to promote your skills
Facebook / LinkedIn Groups
GitHub / Kaggle
How to stand out from the crowd
How to prepare a CV / Interview
What questions should you expect
How to prepare yourself for the interview
8. Understanding the Role
1. Data Science vs Artificial Intelligence vs Machine Learning
2. Data Science vs Data Analyst vs Business Intelligence
5-10 Minutes
9. Data Scientist
The title “data scientist” is relatively new and is not yet clearly defined. Because
it lacks specificity, it is sometimes perceived as an elevated synonym for
“data analyst,” but that is not the case. A data scientist possesses a combination
of analytic, machine learning, data mining, and statistical skills in addition to
experience with algorithms and coding.
Data scientists also typically have expertise in tools and languages such as R, SAS,
Python, MATLAB, SQL, Hive, Pig, and Spark. But perhaps the most important skill that a
data scientist possesses is the ability to explain the significance of data in a way
that can be easily understood by others.
24. Scope of Business Analytics
Descriptive analytics
- uses data to understand past and present
Predictive analytics
- analyzes past performance to predict future outcomes
Prescriptive analytics
- uses optimization techniques to identify the best course of action
25. Scope of Business Analytics
Example 1.1 Retail Markdown Decisions
Most department stores clear seasonal inventory by
reducing prices.
The question is:
When to reduce the price and by how much?
Descriptive analytics: examine historical data for similar
products (prices, units sold, advertising, …)
Predictive analytics: predict sales based on price
Prescriptive analytics: find the best sets of pricing and
advertising to maximize sales revenue
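The three levels in the markdown example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sales history is made up, demand is assumed to be linear in price, and the function names (describe, fit_demand, best_price) are hypothetical, not taken from any library or from the course.

```python
# Hypothetical sales history for one seasonal product: (price, units sold per week).
history = [(100, 20), (90, 30), (80, 40), (70, 50), (60, 60)]

def describe(rows):
    """Descriptive: summarize what happened in the past."""
    prices = [p for p, _ in rows]
    units = [u for _, u in rows]
    return {"avg_price": sum(prices) / len(prices),
            "avg_units": sum(units) / len(units)}

def fit_demand(rows):
    """Predictive: least-squares line units = intercept + slope * price."""
    n = len(rows)
    mean_p = sum(p for p, _ in rows) / n
    mean_u = sum(u for _, u in rows) / n
    sxy = sum((p - mean_p) * (u - mean_u) for p, u in rows)
    sxx = sum((p - mean_p) ** 2 for p, _ in rows)
    slope = sxy / sxx
    intercept = mean_u - slope * mean_p
    return intercept, slope

def best_price(intercept, slope, candidates):
    """Prescriptive: choose the candidate price with the highest predicted revenue."""
    def revenue(price):
        # Predicted units cannot be negative.
        return price * max(0.0, intercept + slope * price)
    return max(candidates, key=revenue)

summary = describe(history)
intercept, slope = fit_demand(history)
recommended = best_price(intercept, slope, range(50, 101, 5))
print(summary, intercept, slope, recommended)
```

A real markdown decision would also account for advertising, remaining inventory and timing; the sketch only shows how each level builds on the one before it.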
26. Data for Business Analytics
Four Types of Data Based on Measurement Scale:
Categorical (nominal) data
Ordinal data
Interval data
Ratio data
27. Data for Business Analytics
Example 1.3
Classifying Data Elements in a Purchasing Database
28. Data for Business Analytics
Example 1.3 (continued)
Classifying Data Elements in a Purchasing Database
29. Data for Business Analytics
Categorical (nominal) Data
Data placed in categories according to a specified
characteristic
Categories bear no quantitative relationship to one
another
Examples:
- customer’s location (America, Europe, Asia)
- employee classification (manager, supervisor,
associate)
30. Data for Business Analytics
Ordinal Data
Data that is ranked or ordered according to some
relationship with one another
No fixed units of measurement
Examples:
- college football rankings
- survey responses
(poor, average, good, very good, excellent)
31. Data for Business Analytics
Interval Data
Ordinal data but with constant differences
between observations
No true zero point
Ratios are not meaningful
Examples:
- temperature readings
- SAT scores
32. Data for Business Analytics
Ratio Data
Continuous values that have a natural zero point
Ratios are meaningful
Examples:
- monthly sales
- delivery times
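One way to see the difference between the four scales is to check which operations give a meaningful answer on each. The following is a minimal Python sketch with made-up values; all variable names are illustrative assumptions.

```python
from collections import Counter
from statistics import median

# Categorical (nominal): only counting and the mode are meaningful.
locations = ["Europe", "Asia", "Europe", "America"]
most_common_location = Counter(locations).most_common(1)[0][0]

# Ordinal: categories can be ranked, so a median makes sense; a mean does not.
scale = ["poor", "average", "good", "very good", "excellent"]
responses = ["good", "poor", "excellent", "good", "average"]
ranks = sorted(scale.index(r) for r in responses)
median_response = scale[int(median(ranks))]

# Interval: differences are meaningful, ratios are not (no true zero).
celsius_a, celsius_b = 20.0, 10.0
ratio_celsius = celsius_a / celsius_b
ratio_kelvin = (celsius_a + 273.15) / (celsius_b + 273.15)
difference_celsius = celsius_a - celsius_b
difference_kelvin = (celsius_a + 273.15) - (celsius_b + 273.15)

# Ratio: a natural zero makes ratios independent of the unit.
sales_dollars = (200.0, 100.0)
sales_cents = tuple(s * 100 for s in sales_dollars)
ratio_dollars = sales_dollars[0] / sales_dollars[1]
ratio_cents = sales_cents[0] / sales_cents[1]

print(most_common_location, median_response)
print(ratio_celsius, round(ratio_kelvin, 3))  # the "ratio" changes with the scale
print(ratio_dollars, ratio_cents)             # the ratio stays the same
```

The temperature ratio changes when the zero point moves from Celsius to Kelvin while the difference does not, which is why ratios are not meaningful for interval data.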
33. Unstructured Data
MapReduce, Big Data
NoSQL Databases
Cleaning and Wrangling
34. Big data draws from a number of sources: structured data and
unstructured data. Structured data is organized, typically by
categories that make it easy for a computer to sort, read and organize
automatically.
Unstructured data, the fastest-growing form of big data, is more
likely to come from human input: customer reviews, emails, videos,
social media posts, etc.
Typically, businesses employ data scientists to handle this
unstructured data, whereas other IT personnel are responsible for
managing and maintaining structured data.
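A small Python sketch of the contrast, using invented records and an invented review: structured rows can be sorted or aggregated directly, while free text has to be parsed before anything can be counted.

```python
import re

# Structured: fields are already named and typed, so sorting is trivial.
orders = [
    {"customer": "A", "amount": 120.0},
    {"customer": "B", "amount": 75.5},
]
largest_order = max(orders, key=lambda row: row["amount"])

# Unstructured: free text must be tokenized before it can be analyzed.
review = "Delivery was late, but the product is great. Great value overall!"
words = re.findall(r"[a-z']+", review.lower())
great_mentions = words.count("great")

print(largest_order["customer"], great_mentions)
```

Counting a keyword is far simpler than real text analysis, but it shows the extra cleaning and wrangling step that unstructured data requires.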
36. How many Machine Learning algorithms should I
know?
Decision tree
Random forest
Logistic regression
Support vector machine
Naive Bayes
k-nearest neighbors
k-means
AdaBoost
Neural network
Markov models
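To give a feel for how simple some of the algorithms on this list are, here is a minimal sketch of k-nearest neighbors in plain Python. The training points, labels and the function name knn_predict are illustrative assumptions; in practice a library implementation would normally be used.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training data: (features, label).
train = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.0), "low"),
    ((6.0, 6.0), "high"),
    ((7.0, 5.5), "high"),
    ((6.5, 7.0), "high"),
]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.4, 6.1)))
```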
38. Artificial Intelligence vs Machine Learning
Machines Will Do Half Our Work By 2025 (Forbes Sep 2018).
Artificial Intelligence is the broader concept of machines being able to carry out
tasks in a way that we would consider “smart”. Artificial intelligences are devices
designed to act intelligently. ML and neural networks. Python automation.
Source: https://www.forbes.com/sites/patrickwwatson/2018/09/27/machines-will-do-half-our-work-by-2025/#204a1b255e2a
41. Demand and Supply of Data Science Professional
Bridging The Data Scientist Talent Gap Starts With Defining The Current Role
(Forbes June 2018). Demand for data science and analytics skills: new job postings
are projected to reach 2.72M in 2020 (BHEF/PwC 2017). Annual demand for the
fast-growing new roles of data scientists, data developers, and data engineers will
reach nearly 700,000 openings by 2020. By 2020, the number of jobs for all US
data professionals will increase by 364,000 openings to 2,720,000,
according to IBM.
42. IT Spending, Freelancing and Hiring Trends
IT spending is projected to reach about $3.85 trillion in 2019, up 2.8% from
2018. 36% of the workforce is contract-based or freelance talent, with
projections showing freelancers will outnumber non-freelancers in the U.S. by
2027. Predictive analytics algorithms monitor 3 GB of data every
second, streaming from millions of network interfaces. What's Coming: Tech
Hiring Predictions For 2019 (Forbes June 2018). The Amazing Ways Verizon Uses
AI And Machine Learning To Improve Performance.
43. Future Jobs - Machines taking away Jobs
Deep Learning is used by Google in its voice and image recognition
algorithms, and by Netflix and Amazon to decide what you want to watch or buy. ML is
described as a sub-discipline of AI. The Workforce Needs AI -- But AI Needs Human
Workers, Too (Forbes Nov 2018). AI is expected to be able to write a high school
essay and drive a truck better than a human can, to have a 50% chance of
outperforming humans at all tasks within 45 years, and to automate all jobs within
the next century. 14-54% of the U.S. workforce could see their jobs automated in
the next two decades. Let The Robots Take Over: How The Future Of AI Will Create
More Jobs (Forbes Dec 2018).
44. Future of Job Market
75% of finance departments will employ automation by 2020.
Jobs taken away by Artificial Intelligence. Robots Aren't Coming For Jobs: AI Is
Already Taking Them (Forbes Oct 2018).
Credit Suisse is using deep neural networks, random forests and NLP to
eliminate analyst jobs (WatersTechnology 2019). What Is The
Difference Between Deep Learning, Machine Learning and AI? (Forbes Dec 2017).
10 Amazing Examples Of How Deep Learning AI Is Used In Practice? (Forbes Dec
2018). Machine Learning And AI Will Disrupt All Careers. Eight Ways Big Data And
AI Are Changing The Business World (Forbes 2018).
48. Data Science Job - Demand & Salary
Data Scientist has been named the best job in America for three years running,
with a median base salary of $110,000 and 4,524 job openings. Data Scientist
Is the Best Job In America According Glassdoor's 2018 Rankings (Forbes Jan 2018).
49. Data Science Jobs
Data science is a fast growing and lucrative field, with the BLS predicting jobs in
this field will grow 11 percent by 2024. Data scientist is also shaping up to be a
satisfying long-term career path. According to data from Robert Half's 2018
Technology and IT Salary Guide, the average salary for data scientists, based on
experience, breaks down as follows:
25th percentile: $100,000
50th percentile: $119,000
75th percentile: $142,750
95th percentile: $168,000
52. Introduction
● The difference between Data Science, Machine Learning, Artificial Intelligence and
Data Analytics, and how industry and HR use these terms when writing job
descriptions.
● You will learn to use Python to help you acquire, parse and model your data.
● A significant portion of the course will be a hands-on approach to the
fundamental modeling techniques and machine learning algorithms that
enable you to build robust predictive models of real-world data and test their
validity.
● Scala, Hadoop and other technologies can be faster and may be one
level closer to production, but the aim of the course remains to develop an analytical
thought process. Many data wrangling terms and concepts stay the same
across tools, so they are language-agnostic.
53. What is inside the typical Data Science Course
● Mathematics
● Statistics
● Statistical techniques in Python & Data Visualization
● Machine Learning
● Big Data Engineering
● Deep Learning
Pre-work: Introductory Python (Optional), Data Analysis and Visualization with
Python, Statistics
54. How much Math / Statistics is required?
Logarithm, exponential, polynomial functions, rational numbers.
Basic geometry and theorems, trigonometric identities.
Real and complex numbers and basic properties.
Series, sums, and inequalities.
Graphing and plotting, Cartesian and polar coordinate systems, conic sections.
Linear algebra (and ideally basic multivariate calculus)
Regression: linear regression and the things that violate the assumptions of linear
models (e.g., autocorrelation in time series data, non-independent observations)
Probability theory ... especially Bayes' Law and Central Limit Theorem
Numerical analysis (e.g., time series analysis and forecasting)
Core machine learning methods (clustering, decision trees, k-NN)
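As an example of why Bayes' Law earns a place on this list, consider a worked calculation in Python. The scenario and all numbers are hypothetical: a fraud detector flags a transaction, and we ask how likely the transaction is to be fraudulent given the flag.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """Bayes' Law: P(event | flag) from P(event), P(flag | event), P(flag | no event)."""
    evidence = true_positive_rate * prior + false_positive_rate * (1.0 - prior)
    return true_positive_rate * prior / evidence

# Assumed numbers: 1% of transactions are fraudulent, the detector catches
# 90% of fraud, and it wrongly flags 5% of legitimate transactions.
p_fraud_given_flag = posterior(prior=0.01,
                               true_positive_rate=0.90,
                               false_positive_rate=0.05)
print(round(p_fraud_given_flag, 3))
```

With these assumed rates, only about 15% of flagged transactions are actually fraudulent, because legitimate transactions vastly outnumber fraudulent ones. This kind of base-rate reasoning comes up constantly when evaluating classifiers.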
56. Excelling as Data Scientist
What Does It Take To Excel As A Data Scientist These Days? (Nov 2018).
Companies are only using about 12% of the data. Core Curriculum: Hadoop,
Spark, Machine Learning, Visualization. Specialization: Deep Learning, Data
Engineering & Big Data, Automation (DevOps)
57. Technical Skills for Data Scientists
Math (e.g. linear algebra, calculus and probability)
Statistics (e.g. hypothesis testing and summary statistics)
Machine learning tools and techniques (e.g. k-nearest neighbors, random forests, ensemble methods, etc.)
Software engineering skills (e.g. distributed computing, algorithms and data structures)
Data mining
Data cleaning and munging
Data visualization (e.g. ggplot and d3.js) and reporting techniques
Unstructured data techniques
R and/or SAS languages
SQL databases and database querying languages
Python (most common), C/C++, Java, Perl
Big data platforms like Hadoop, Hive & Pig
Cloud tools like Amazon S3
http://www.bhef.com/sites/default/files/bhef_2017_investing_in_dsa.pdf
Data Scientist Is the Best Job In America According Glassdoor's 2018 Rankings
https://www.forbes.com/sites/louiscolumbus/2018/01/29/data-scientist-is-the-best-job-in-america-according-glassdoors-2018-rankings/#c8cd6555357e