2.
What is Data Science?
Data Science is, in general terms,
the extraction of knowledge from
data
3.
What is Data Science?
Data is increasingly cheap and ubiquitous. We
are collecting and analyzing data that is
unprecedented in variety, complexity and
scale.
At the same time, new technologies are
emerging to organize and make sense of this
avalanche of data.
4.
What is Data Science?
Data Science is an interdisciplinary subject
employing concepts and techniques from
mathematics, statistics, computer science
and economics.
It is used to identify patterns and regularities in
data, affecting all aspects of work and society
from medicine to marketing to scientific
research.
5.
Who is a Data Scientist?
A data scientist is someone who is
better at statistics than most
software engineers and better at
software engineering than most
statisticians
6.
Who is a Data Scientist?
A Data Scientist is a professional
with the training and curiosity to
make discoveries while swimming in
an ocean of data, communicating
what they learn and suggesting its
implications for new decisions.
7.
Who is a Data Scientist?
They identify and combine rich and potentially
incomplete data sources, and bring structure to
large quantities of formless data, making
analysis possible.
They engage decision makers in an ongoing
conversation based on the implications of the
data for products, processes, and decisions.
8.
Who is a Data Scientist?
★ A Data Scientist should have solid
quantitative and analytic skills
Statistical Modelling
Experimental Design
Bayesian Inference
Machine Learning
Information Theory
Complex Systems
9.
Who is a Data Scientist?
★ A Data Scientist should be a good
programmer
Scripting: e.g. Python
Statistical Packages: e.g. R
Databases: SQL and NoSQL
MapReduce concepts
Hadoop and Hive/Pig
Computer Science
10.
Who is a Data Scientist?
In addition, a Data Scientist should
★ excel at communication and visualization
★ understand economics and business
concepts
★ be curious and creative
12.
Demand for Data Scientists
There is a growing demand for data-savvy
professionals in businesses, public agencies,
and nonprofits.
There is a limited supply of professionals who
can efficiently work with data at scale.
Thus, the salaries for data engineers, data
scientists, statisticians, and data analysts
have increased rapidly.
13.
A recent study by the McKinsey Global
Institute estimates that there will be four to
five million jobs in the U.S. requiring data
analysis skills by 2018, and that large numbers
of positions will only be filled through training
or retraining.
14.
In a survey of 816 data professionals in 53
countries, O’Reilly Media reports a median
annual salary of $98,000 for Data Science
professionals.
SQL, R, Python and Excel are the top-earning
skills.
15.
Data Science in India
According to a survey by Gartner
★ In 2013, the Data Analytics market in India
was $1.6 Billion with a growth rate of 8%
★ By 2018, the market is projected to be $3.7
Billion
"For the fourth year in a row, analytics ranks as the No.
1 priority in Gartner's CIO [India] Survey," explains Bhavish Sood,
research director at Gartner.
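The two market figures above can be turned into an implied compound annual growth rate. This is only a rough sketch: it assumes the 2013 and 2018 figures are directly comparable and that growth compounds yearly. The function name `implied_cagr` is illustrative and not part of the survey. The result need not match the single-year growth rate quoted for 2013.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the constant yearly growth rate that takes start_value to end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the slide, in billions of US dollars.
market_2013 = 1.6
market_2018 = 3.7

cagr = implied_cagr(market_2013, market_2018, years=2018 - 2013)
print(f"Implied compound annual growth rate: {cagr:.1%}")
```

Under these assumptions the projection corresponds to growth of roughly 18% per year.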
16.
India is one of the strongest countries in the Data
Science marketplace, with clients including
Facebook, GE, NASA, Tesco and Merck. It has the
potential to build a talent pipeline for data scientists,
something that is virtually non-existent today.
India will need 200,000 data scientists in the next few
years. A single company, Wipro, already has as many as
8,000 people in analytics functions.
17.
Data Science in India
The median annual salary for a Data Scientist in
India is Rs 670,665.
The highest paying skills are
Python, Machine Learning,
Statistical Analysis, Big Data
Analytics, and R.
18.
Bengal Chamber proposes smart and
green city for business analytics firms
The Bengal Chamber of Commerce and Industry has
taken an initiative to set up a smart city for business
analytics in West Bengal.
The project would involve service providers like KPMG
Advisory Services and PricewaterhouseCoopers,
corporate consumers, education institutions such as
Indian Institute of Technology Kharagpur, the Indian
Statistical Institute, and the Indian Institute of
Management, Calcutta.
19.
How can you be a Data Scientist?
A Master’s degree is a natural route to becoming a Data
Scientist.
Massive Open Online Courses (MOOCs) give access to
self-learning at a low cost (often free), but leave it to the
student to identify a suitable set of courses and tools to
round out a coherent skill set.
Bootcamps offer students a practical and structured
learning environment at a far more affordable rate
compared with obtaining a Master’s Degree.
20.
Master’s Degree
Duration: 9 - 20 months
Faculty: University Professors
Learning: Theory and Assignments
Outcome: Degree
Projects: Practicum and Internship
Placement: University Recruiting
Examples: UC Berkeley, NYU, NCSU, IIT+IIM+ISI
Tuition: $20,000 - $70,000 (US), ₹20,000,000 (India)
21.
Self-Learning (MOOCs)
Duration: 6 - 18 months (part time)
Faculty: University Professors (recorded lectures)
Learning: Self-guided
Outcome: Certificate
Projects: Projects on own time
Placement: Self-driven job search
Examples: Coursera, Udacity
Tuition: Free - $500 (US)
22.
Bootcamps
Duration: 2 - 3 months
Faculty: Professors & Data Scientists
Learning: Experiential Learning
Outcome: Certificate and Portfolio
Projects: Built-In Projects
Placement: Hiring Day and Placement Assistance
Examples: Zipfian, Metis, Data Incubator
Tuition: Free - $16,000 (US)
23.
The Course
Data+Science: A First Course is an intensive
eight-week program based on the bootcamp
model, organized by The Data+Science
Initiative.
It is designed to teach and train graduates in
quantitative fields to take an entry-level
position as a data scientist.
24.
Objectives of the Course
Upon graduating a student will:
1. Have a clear understanding of and practical
experience with the process of designing,
implementing, and communicating the results of a
data science project.
2. Understand the landscape of data science tools and
their applications, and be prepared to identify and
dig into new technologies and algorithms needed
for the job at hand.
25.
Overview
Data science gives valuable meaning to large sets
of complex and unstructured data.
The focus is on concepts and techniques to
mine, store, analyse and visualize data.
Data science is a highly interdisciplinary field, drawing
on areas such as computer science (algorithms
and databases), statistics (hypothesis testing and
inference) and artificial intelligence (pattern
recognition and machine learning).
26.
Course Content
Data Mining (⅛):
identifying data sources; extracting, cleaning
and verifying structured and unstructured data
Data Storage (¼):
structuring, storage and retrieval of data;
including big data and NoSQL
Data Analysis (½):
descriptive and inferential analysis; predictive
modelling, risk analysis and decision making
Data Visualization (⅛)
27.
Course Content
Graduating students will:
1. Be proficient in statistical concepts and
mathematical techniques including correlation
functions, inference and hypothesis testing.
2. Be able to make predictive analyses by modelling
stochastic processes based on available data.
3. Learn and apply Machine Learning concepts to
solve data science problems
28.
Course Content
4. Be capable coders in Python and R, including the
related packages and toolsets most commonly
used in data science.
5. Know the fundamentals of data visualization and
have experience creating static and dynamic data
visuals using JavaScript and D3.js.
6. Have introductory exposure to big data tools and
architecture such as the Hadoop stack, know when
these tools are necessary, and be poised to quickly
train up and utilize them in a big data project.
29.
Prerequisites
Basic Statistics and Probability
descriptive statistics and distributions
Linear Algebra
vectors and matrices
Calculus and Differential Equations
basic calculus and finding extrema, ordinary
differential equations
Programming
basic proficiency in any programming language
30.
Preferred Subjects
Computer Science
algorithms, data structures and databases
Advanced Statistics
Bayesian inference and stochastic processes
Statistical Mechanics/Information Theory
entropy, information, complexity
Economics
supply/demand, game theory
Web Development
HTML, CSS and JavaScript
31.
Eligibility
Anyone who meets the prerequisites, as determined
by a qualifying exam, is eligible. Preference is
given to those with knowledge of the preferred
subjects.
However, we would prefer applicants to have a
bachelor’s degree in a quantitative field, such
as: Engineering, Physics, Mathematics,
Statistics, Economics or Computer
Applications.
32.
Course Details
The course consists of 24 classes over 8 weeks.
Each class (Mondays, Wednesdays, Fridays) is 6
hours in duration (10AM-4PM) including a lunch
hour.
Morning sessions consist of lectures and
discussions, while the afternoons are guided
programming sessions.
In addition, instructors will be available for office
hours at scheduled times.
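As a rough illustration of how this schedule might map onto the four content areas, the sketch below applies the shares from the Course Content slide to the 24 classes. It assumes the shares apply directly to the class count and that the lunch hour leaves five contact hours per class. Both are assumptions made for illustration, not a published timetable, and the names `shares` and `allocate` are hypothetical.

```python
from fractions import Fraction

TOTAL_CLASSES = 24                # 3 classes a week for 8 weeks
CONTACT_HOURS_PER_CLASS = 6 - 1   # 10AM-4PM minus the lunch hour (assumption)

# Shares taken from the Course Content slide.
shares = {
    "Data Mining": Fraction(1, 8),
    "Data Storage": Fraction(1, 4),
    "Data Analysis": Fraction(1, 2),
    "Data Visualization": Fraction(1, 8),
}


def allocate(shares, total_classes):
    """Split total_classes across topics, insisting on whole classes."""
    if sum(shares.values()) != 1:
        raise ValueError("shares must sum to 1")
    allocation = {}
    for topic, share in shares.items():
        classes = share * total_classes
        if classes.denominator != 1:
            raise ValueError(f"{topic} does not map to a whole number of classes")
        allocation[topic] = int(classes)
    return allocation


allocation = allocate(shares, TOTAL_CLASSES)
for topic, classes in allocation.items():
    hours = classes * CONTACT_HOURS_PER_CLASS
    print(f"{topic}: {classes} classes, {hours} contact hours")
```

Under these assumptions Data Analysis takes 12 of the 24 classes, with the remaining 12 split across mining, storage and visualization.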
33.
Course Projects
The course is divided into three parts.
Part A (Weeks 1-4): daily programming projects
executed individually or in groups
Part B (Weeks 5-8): weekly projects in groups
drawn from the industry
Part C (Weeks 9-11, optional): course project in
groups with biweekly meetings with instructors
34.
Benefits
Employment: Students will have the skill set and
portfolio to find employment as an entry-level
data scientist. Such a skill set is in great demand,
both domestically and in developed
countries.
Research: Since Data Science is at the core of
academic research, our students, armed with the
knowledge, portfolio and recommendations, will
find it easier to gain admission to universities,
especially abroad.