SlideShare uma empresa Scribd logo
1 de 137
Baixar para ler offline
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
An Introduction to Statistical Learning with Applications in R
The Elements of Statistical Learning
A Programmer’s Guide to Data Mining
Probabilistic Programming & Bayesian Methods for Hackers
Think Bayes, Bayesian Statistics Made Simple
Deriving Knowledge from Data at Scale
Data Mining and Analysis, Fundamental Concepts and Algorithms
An Introduction to Data Science
Machine Learning
Machine Learning – The Complete Guide
Deriving Knowledge from Data at Scale
Bayesian Reasoning and Machine Learning
A Course in Machine Learning
Information Theory, Inference and Learning Algorithms
Modeling with Data
Mining of Massive Datasets
Information Theory, Inference and Learning Algorithms
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
• Introduction to machine learning
• Overview of the machine learning process
• Introduction to select algorithms
• Introduction to concepts we will examine later in course:
bias/variance, generalization, underfitting, overfitting, ensemble
methods, etc.
• Elements of a time series
• Functions to manipulate time series
• Lay a foundation for Prediction and Forecasting…
Deriving Knowledge from Data at Scale
http://www.cs.waikato.ac.nz/ml/weka/
install by next class, get the developer version…
Deriving Knowledge from Data at Scale
• Class Discussions 20 minutes
• Machine Learning Primer 60 minutes
• Break 15 minutes
• Time Series & Forecasting, 1/2 60 minutes
• Closing discussion 15 minutes
Deriving Knowledge from Data at Scale
so we need teams
Deriving Knowledge from Data at Scale
Discussion on Assigned Reading…
Week One
Deriving Knowledge from Data at Scale
The Data Science Workflow
Lecture 1 Review
Stay in the immediate zone during exploratory modeling, to extent possible:
• 5 to 10 minutes per experiment, results in 100’s per day;
• Small, statistically sound/relevant samples;
• Linear modelling during feature exploration;
• Don’t write ML algorithms, use packages, do learn to write data manipulation code;
Start with a sample of data, as soon as you can get it…
• You will learn a considerable amount from that first data sample
• Quality of the data, what fields are being collected, possible missing values and/or
missing fields, and the questions will begin to flow (dialogue with customer will be at a
much more meaningful level).
If your customer can’t quantify it, you can’t change/improve it…
Deriving Knowledge from Data at Scale
The Data Science Workflow
Loops within loops…
Deriving Knowledge from Data at Scale
The Data Science Workflow
Define the Goal
The first task in a data science project is to define a measurable & quantifiable
goal. At this stage, learn all that you can about the context of your project.
• Why do the project sponsors want the project in the first place? What do they lack, and
what do they need?
• What are they doing to solve the problem now, and why isn't that good enough?
• What resources will you need: what kind of data, how much staff, will you have domain
experts to collaborate with, what are the computational resources?
• How do the project sponsors plan to deploy your results? What are the constraints that
have to be met for successful deployment?
Not We want to get better at finding bad loans.
But We want to reduce our rate of loan charge-offs by at least 10%, using a model
that predicts which loan applicants are likely to default.
Deriving Knowledge from Data at Scale
A concrete goal begets concrete stopping conditions and concrete
acceptance criteria. The less specific the goal, the likelier that the
project will go unbounded, because no result will be "good
enough." If you don't know what you want to achieve, you don't
know when to stop trying – or even what to try. When the project
eventually terminates – because either time or resources run out –
no one will be happy with the outcome…
Deriving Knowledge from Data at Scale
The Data Science Workflow
Data Collection and Management
This step encompasses identifying the data you need, exploring it, and
conditioning it to be suitable for analysis. This stage is often the most time-
consuming step in the process. It's also one of the most important.
• What data is available to me?
• Will it help me solve the problem?
• Is it enough?
• Is it of good enough quality?
Rule First piece of data is very informative, data set utility is roughly logarithmic in size.
Prefer Direct measurements, but if they are not available identify proxy variables.
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
The Data Science Workflow
Modeling
We get to statistics and machine learning during the modeling stage. Here is where
you try to extract useful insights from the data. Many modeling procedures make
specific assumptions about data distribution and relationships, there will be back-and-
forth between the modeling and data cleaning stages as you try to find the best way to
represent and model the data. The most common data science modeling tasks are:
• Classification: deciding if something belongs to one category or another.
• Scoring: predicting or estimating a numeric value such as a price or probability.
• Ranking: learning to order items by preferences.
• Clustering: grouping items into most similar groups.
• Finding Relations: finding correlations or potential causes of effects seen in the data.
• Characterization: very general plotting and report generation from data.
Deriving Knowledge from Data at Scale
The Data Science Workflow
Model Evaluation and Critique
Once you have a model, you need to determine if it meets your goals.
• Is it accurate enough for your needs?
• Does it generalize well?
• Does it perform better than "the obvious guess"? Better than whatever
estimate you currently use?
• Do the results of the model (coefficients, clusters, rules) make sense in the
context of the problem domain?
If you've answered "no" to any of the above questions, it's time to loop back to
the modeling step — or decide that the data doesn't support the goal you are
trying to achieve.
Deriving Knowledge from Data at Scale
The Data Science Workflow
Model Deployment and Maintenance
Finally, the model is put into operation.
In many organizations this means the data scientist no longer has primary
responsibility for the day-to-day operation of the model. However, you still
should ensure that the model will run smoothly and will not make disastrous
unsupervised decisions. You also want to make sure that the model can be
updated as its environment changes. And in many situations, the model will
initially be deployed in a small pilot program.
The test might bring out issues that you didn't anticipate, and you will have to
adjust the model appropriately.
Deriving Knowledge from Data at Scale
The Data Science Workflow
Presentation and Documentation
Once you have a model that meets your success criteria, you will present your results
to your project sponsor and other stakeholders. You must also document the model
for those in the organization who are responsible for using, running, and maintaining
the model once it has been deployed. Model interpretability may be an issue…
Deriving Knowledge from Data at Scale
The Data Science Workflow
Loops within loops…
Deriving Knowledge from Data at Scale
• Class Discussions 30 minutes
• Machine Learning Primer 60 minutes
• Break 15 minutes
• Time Series & Forecasting 60 minutes
• Closing discussion 15 minutes
Deriving Knowledge from Data at Scale
1 1 5 4 3
7 5 3 5 3
5 5 9 0 6
3 5 2 0 0
training data (expensive) synthetic training data (cheaper)
solve hard problems
value from Big Data
human intelligence
Machine learning enables nearly every
value proposition of web search.
Hundreds of thousands of machines…
Hundreds of metrics and signals per machine…
Which signals correlate with the real cause of a problem?
How can we extract effective repair actions?
solve hard problems
value from Big Data
human intelligence
business analytics
Predicting future performance from historical data
Recommenda-
tion engines
Advertising
analysis
Weather
forecasting for
business
planning
Social network
analysis
IT infrastructure
and web app
optimization
Legal
discovery and
document
archiving
Pricing analysis
Fraud
detection
Churn
analysis
Equipment
monitoring
Location-based
tracking and
services
Personalized
Insurance
Predictive analytics
address the likelihood of
something happening in
the future, even if it is
just an instant later…
object encoded with features
(think DB attributes/ OO member fields of primitive types)
𝑑 is the feature dimensionality.
classifier
prediction
(response/dependent variable).
Can be qualitative/quantitative
(classification/regression).
𝑶𝒃𝒋𝒆𝒄𝒕 → 𝑶𝒖𝒕𝒄𝒐𝒎𝒆
Entity → Category
Entity → PopularityComplex decision making:
𝑿 → 𝒀
𝑿 = (𝒙 𝟎, … , 𝒙 𝒅) → 𝒀
input/independent variable
We may know the relation for certain values of 𝑋 and 𝑌:
In fact, we may know the relation for many 𝒙s and 𝑦s:
𝒙, 𝑦
𝒙 𝟏 , 𝑦 𝟏 , … , 𝒙 𝑵 , 𝑦 𝑵 The 𝑖-th 𝑥 is: 𝒙(𝑖)
= 𝑥0
(𝑖)
, … , 𝑥 𝑑
(𝑖)
TRAINING
Input
𝒙 = (𝑥 𝟎, … , 𝑥 𝒅)
Online
System
object encoded with features
classifier
prediction
(response/dependent
variable)
Final
Output
𝑦
Model
Offline
Training
Sub-system
Training Data
𝒙 𝟏
, 𝑦 𝟏
, … , 𝒙 𝑵
, 𝑦 𝑵
where 𝒙(𝒊) = 𝑥 𝟎
(𝒊)
, … , 𝑥 𝒅
(𝒊)
𝑓 𝑋 = 𝑌𝑋 → 𝑌 Task is very complex . Hard to construct good 𝑓.
We construct an approximation to 𝑓: 𝑔(𝑋) ≈ 𝑌
Hypothesis space: 𝑔(𝑋) ∈ 𝐻.
Deriving Knowledge from Data at Scale
The decision is driven by both the nature of your
data and the question you are trying to answer
Deriving Knowledge from Data at Scale
Out of Class Reading
Week Two
gender age smoker eye
color
male 19 yes green
female 44 yes gray
male 49 yes blue
male 12 no brown
female 37 no brown
female 60 no brown
male 44 no blue
female 27 yes brown
female 51 yes green
female 81 yes gray
male 22 yes brown
male 29 no blue
lung
cancer
no
yes
yes
no
no
yes
no
no
yes
no
no
no
male 77 yes gray
male 19 yes green
female 44 no gray
?
?
?
gender age smoker eye
color
male 19 yes green
female 44 yes gray
male 49 yes blue
male 12 no brown
female 37 no brown
female 60 no brown
male 44 no blue
female 27 yes brown
female 51 yes green
female 81 yes gray
male 22 yes brown
male 29 no blue
lung
cancer
no
yes
yes
no
no
yes
no
no
yes
no
no
no
male 77 yes gray
male 19 yes green
female 44 no gray
?
?
?
Train
ML Model
gender age smoker eye
color
male 19 yes green
female 44 yes gray
male 49 yes blue
male 12 no brown
female 37 no brown
female 60 no brown
male 44 no blue
female 27 yes brown
female 51 yes green
female 81 yes gray
male 22 yes brown
male 29 no blue
lung
cancer
no
yes
yes
no
no
yes
no
no
yes
no
no
no
male 77 yes gray
male 19 yes green
female 44 no gray
yes
no
no
Train
ML Model
Always
focuses on a
bias-variance decomposition We have a lecture dedicated to evaluation
metrics for models.
key
easily the most important factor
key
CITY 1 LAT. CITY 1 LNG. CITY 2 LAT. CITY 2 LNG. DRIVABLE?
123.24 46.71 121.33 47.34 Yes
123.24 56.91 121.33 55.23 Yes
123.24 46.71 121.33 55.34 No
123.24 46.71 130.99 47.34 No
key
Feature engineering
More data wins
Once you’ve defined your input fields, there’s only so much analytic
gymnastics you can do. Computer algorithms trying to learn models have
only a relatively few tricks they can do efficiently, and many of them are
not so very different. Performance differences between algorithms are
typically not large. Thus, if you want better classifiers:
1. Engineer better features
2. Get your hands on more high-quality data
The power of ensembles
models vote for the final prediction
Occam’s Razor
smaller faster to fit
more interpretable
representable
could will
observational data can only show us that two
variables are related, but it cannot tell us the “why”
Freakonomics
tended to have higher standardized test scores
Deriving Knowledge from Data at Scale
Break, 5 minutes…
Deriving Knowledge from Data at Scale
boosting)
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
Ensemble Classification
Deriving Knowledge from Data at Scale
Let’s look at the Netflix Prize Competition…
Deriving Knowledge from Data at Scale
Began October 2006
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
from http://www.research.att.com/~volinsky/netflix/
However, improvement slowed…
Deriving Knowledge from Data at Scale
Today, the top team has posted
a 8.5% improvement.
Ensemble methods are the best
performers…
Deriving Knowledge from Data at Scale
“Thanks to Paul Harrison's
collaboration, a simple mix of our
solutions improved our result from
6.31 to 6.75”
Rookies
Deriving Knowledge from Data at Scale
“My approach is to combine the
results of many methods (also two-
way interactions between them)
using linear regression on the test
set. The best method in my
ensemble is regularized SVD with
biases, post processed with kernel
ridge regression”
Arek Paterek
http://rainbow.mimuw.edu.pl/~ap/ap_kdd.pdf
Deriving Knowledge from Data at Scale
“When the predictions of multiple
RBM models and multiple SVD
models are linearly combined, we
achieve an error rate that is well
over 6% better than the score of
Netflix’s own system.”
U of Toronto
http://www.cs.toronto.edu/~rsalakhu/papers/rbmcf.pdf
Deriving Knowledge from Data at Scale
Gravity
home.mit.bme.hu/~gtakacs/download/gravity.pdf
Deriving Knowledge from Data at Scale
“Our common team blends the result
of team Gravity and team Dinosaur
Planet.”
Might have guessed from the name…
When Gravity and
Dinosaurs Unite
Deriving Knowledge from Data at Scale
And, yes, the top team which is from
AT&T…
“Our final solution
(RMSE=0.8712) consists of
blending 107 individual results. “
BellKor / KorBell
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
• 83.7% majority vote accuracy
• 99.9% majority vote accuracy
Intuitions
Deriving Knowledge from Data at Scale
Boosting
Bagging
Deriving Knowledge from Data at Scale
Leo Breiman
Deriving Knowledge from Data at Scale
• Resampl few data
• Resample the minority data skew
Deriving Knowledge from Data at Scale
N=10
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
Break, 5 minutes…
Deriving Knowledge from Data at Scale
Time Series & Forecasting, 1/2
Deriving Knowledge from Data at Scale
median filter, a windowing technique that moves through the data point-by-point, and
replaces it with the median value calculated for the current window…
Deriving Knowledge from Data at Scale
Visual inspection reveals smoothing of the outliers without dampening the naturally
occurring peaks and troughs (no signal loss). Prior to smoothing, we could see no correlation
in our data, but afterwards, Spearman’s Rho was ~0.5 for almost all parameters.
Deriving Knowledge from Data at Scale
?
?
?
Deriving Knowledge from Data at Scale
3.7
12.5
9.0
Deriving Knowledge from Data at Scale
Forecasting
Quantitative
Causal
Model
Trend
Time series
Stationary
Trend
Trend +
Seasonality
Qualitative
Expert
Judgment
Delphi
Method
Grassroots
Deriving Knowledge from Data at Scale
Year
1
Year
2
Year
3
Year
4
Demandforproductorservice
Deriving Knowledge from Data at Scale
Year
1
Year
2
Year
3
Year
4
Demandforproductorservice
Trend component
Actual
demand line
Seasonal peaks
Random
variation
Deriving Knowledge from Data at Scale87
Deriving Knowledge from Data at Scale88
Deriving Knowledge from Data at Scale
• Naïve
• Moving Average
• Exponential Smoothing
• Regression
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
91
Deriving Knowledge from Data at Scale92
smoothing time
n
A+...+A+A+A
=F 1n-t2-t1-tt
1t


Ft+1 = Forecast for the upcoming period, t+1
n = Number of periods to be averaged
A t = Actual occurrence in period t
Deriving Knowledge from Data at Scale
3
n
A+...+A+A+A
=F 1n-t2-t1-tt
1t


Month
Sales
(000)
1 4
2 6
3 5
4 ?
5 ?
6 ?
Deriving Knowledge from Data at Scale
Month
Sales
(000)
Moving Average
(n=3)
1 4 NA
2 6 NA
3 5 NA
4 ?
5 ?
(4+6+5)/3=5
6 ?
n
A+...+A+A+A
=F 1n-t2-t1-tt
1t


You’re manager in Amazon’s electronics department. You want to forecast ipod
sales for months 4-6 using a 3-period moving average.
Deriving Knowledge from Data at Scale
Month
Sales
(000)
Moving Average
(n=3)
1 4 NA
2 6 NA
3 5 NA
4 3
5 ?
5
6 ?
?
Deriving Knowledge from Data at Scale
Month
Sales
(000)
Moving Average
(n=3)
1 4 NA
2 6 NA
3 5 NA
4 3
5 ?
5
6 ?
(6+5+3)/3=4.667
Deriving Knowledge from Data at Scale
Month
Sales
(000)
Moving Average
(n=3)
1 4 NA
2 6 NA
3 5 NA
4 3
5 7
5
6 ?
4.667?
Deriving Knowledge from Data at Scale
Month
Sales
(000)
Moving Average
(n=3)
1 4 NA
2 6 NA
3 5 NA
4 3
5 7
5
6 ?
4.667
(5+3+7)/3=5
Deriving Knowledge from Data at Scale99
Deriving Knowledge from Data at Scale
100
Deriving Knowledge from Data at Scale10
Deriving Knowledge from Data at Scale
Month Weighted
Moving
Average
1 4 NA
2 6 NA
3 5 NA
4 31/6 = 5.167
5
6 ?
?
?
1n-tn2-t31-t2t11t Aw+...+Aw+Aw+Aw=F 
Sales
(000)
Deriving Knowledge from Data at Scale
Month Sales
(000)
Weighted
Moving
Average
1 4 NA
2 6 NA
3 5 NA
4 3 31/6 = 5.167
5 7
6
25/6 = 4.167
32/6 = 5.333
1n-tn2-t31-t2t11t Aw+...+Aw+Aw+Aw=F 
Deriving Knowledge from Data at Scale10
Deriving Knowledge from Data at Scale10
Deriving Knowledge from Data at Scale
106
simple
any
• Previous
time
previous actual and forecasted value
Deriving Knowledge from Data at Scale
• gives more weight to recent time periods
Ft+1 = Ft + a(At - Ft)
et
Ft+1 = Forecast value for time t+1
At = Actual value at time t
a = Smoothing constant
Need initial
forecast Ft
to start.
Deriving Knowledge from Data at Scale
Week Demand
1 820
2 775
3 680
4 655
5 750
6 802
7 798
8 689
9 775
10
Given the weekly demand
data what are the exponential
smoothing forecasts for
periods 2-10 using a=0.10?
Assume F1=D1
Ft+1 = Ft + a(At - Ft)
i Ai
Deriving Knowledge from Data at Scale
Week Demand 0.1 0.6
1 820 820.00 820.00
2 775 820.00 820.00
3 680 815.50 793.00
4 655 801.95 725.20
5 750 787.26 683.08
6 802 783.53 723.23
7 798 785.38 770.49
8 689 786.64 787.00
9 775 776.88 728.20
10 776.69 756.28
Ft+1 = Ft + a(At - Ft)
a =
F2 = F1+ a(A1–F1) =820+.1(820–820)
=820
i Ai Fi
Deriving Knowledge from Data at Scale
Week Demand 0.1 0.6
1 820 820.00 820.00
2 775 820.00 820.00
3 680 815.50 793.00
4 655 801.95 725.20
5 750 787.26 683.08
6 802 783.53 723.23
7 798 785.38 770.49
8 689 786.64 787.00
9 775 776.88 728.20
10 776.69 756.28
Ft+1 = Ft + a(At - Ft)
a =
F3 = F2+ a(A2–F2) =820+.1(775–820)
=815.5
i Ai Fi
Deriving Knowledge from Data at Scale
Week Demand 0.1 0.6
1 820 820.00 820.00
2 775 820.00 820.00
3 680 815.50 793.00
4 655 801.95 725.20
5 750 787.26 683.08
6 802 783.53 723.23
7 798 785.38 770.49
8 689 786.64 787.00
9 775 776.88 728.20
10 776.69 756.28
Ft+1 = Ft + a(At - Ft)
This process
continues
through week
10
a =
i Ai Fi
Deriving Knowledge from Data at Scale
Week Demand 0.1 0.6
1 820 820.00 820.00
2 775 820.00 820.00
3 680 815.50 793.00
4 655 801.95 725.20
5 750 787.26 683.08
6 802 783.53 723.23
7 798 785.38 770.49
8 689 786.64 787.00
9 775 776.88 728.20
10 776.69 756.28
Ft+1 = Ft + a(At - Ft)
What if the
a constant
equals 0.6
a = a =
i Ai Fi
Deriving Knowledge from Data at Scale
α
• depends on the emphasis you want to place on the
most recent data
α
Deriving Knowledge from Data at Scale
Ft+1 = a At + a(1- a) At - 1 + a(1- a)2At - 2 + ...
Weights
Prior Period
a
2 periods ago
a(1 - a)
3 periods ago
a(1 - a)2
a=
a= 0.10
a= 0.90
10% 9% 8.1%
90% 9% 0.9%
Ft+1 = Ft + a (At - Ft)
or
w1 w2 w3
Deriving Knowledge from Data at Scale
115
Deriving Knowledge from Data at Scale
Trend
Seasonal
Deriving Knowledge from Data at Scale11
Deriving Knowledge from Data at Scale
0
1
2
3
4
5
6
7
1 2 3 4 5 6 7 8 9 10
Deriving Knowledge from Data at Scale
b0 b1
0
2
4
6
8
10
12
10 11 12 13 14 15 16 17 18 19 20
Best line!
Intercept
Deriving Knowledge from Data at Scale
deseasonalize
• Reseasonalize
Deriving Knowledge from Data at Scale
Deseasonalize
Forecast
Reseasonalize
Actual data Deseasonalized data
Example (SI + Regression)
Deriving Knowledge from Data at Scale12
Deriving Knowledge from Data at Scale
123
Year
Sales
($ mil.)
2002 7
2003 10
2004 9
2005 11
2006 13
Linear Trend – Using the Least Squares Method: An Example
Year t
Sales
($ mil.)
2002 1 7
2003 2 10
2004 3 9
2005 4 11
2006 5 13
Deriving Knowledge from Data at Scale
124
Deriving Knowledge from Data at Scale12
Deriving Knowledge from Data at Scale
126
(Excel function:
=log(x) or log(x,10)
Deriving Knowledge from Data at Scale12
Deriving Knowledge from Data at Scale
SI
SI
Deriving Knowledge from Data at Scale
year
raw index
quarter
four
Deriving Knowledge from Data at Scale130
The table below shows the quarterly sales for Toys International for the years 2001 through
2006. The sales are reported in millions of dollars. Determine a quarterly seasonal index
using the ratio-to-moving-average method.
Seasonal Index – An Example
Deriving Knowledge from Data at Scale13
Deriving Knowledge from Data at Scale13
Deriving Knowledge from Data at Scale13
Deriving Knowledge from Data at Scale
deseasonalize
Deriving Knowledge from Data at Scale
Deriving Knowledge from Data at Scale
• Class Discussions 15 minutes
• Forecasting, continued (2/2) 45 minutes
• Weka Tutorial 30 minutes
• Break 10 minutes
• Decision Trees and Random Forests 60 minutes
• Hands on, decision tree in Weka 15 minutes
Deriving Knowledge from Data at Scale
That’s all for tonight….

Mais conteúdo relacionado

Mais procurados

CRISP-DM - Agile Approach To Data Mining Projects
CRISP-DM - Agile Approach To Data Mining ProjectsCRISP-DM - Agile Approach To Data Mining Projects
CRISP-DM - Agile Approach To Data Mining ProjectsMichał Łopuszyński
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceNiko Vuokko
 
From Raw Data to Deployed Product. Fast & Agile with CRISP-DM
From Raw Data to Deployed Product. Fast & Agile with CRISP-DMFrom Raw Data to Deployed Product. Fast & Agile with CRISP-DM
From Raw Data to Deployed Product. Fast & Agile with CRISP-DMMichał Łopuszyński
 
End-to-End Machine Learning Project
End-to-End Machine Learning ProjectEnd-to-End Machine Learning Project
End-to-End Machine Learning ProjectEng Teong Cheah
 
Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Simplilearn
 
Top 10 Data Science Practitioner Pitfalls
Top 10 Data Science Practitioner PitfallsTop 10 Data Science Practitioner Pitfalls
Top 10 Data Science Practitioner PitfallsSri Ambati
 
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...Edureka!
 
A Practical-ish Introduction to Data Science
A Practical-ish Introduction to Data ScienceA Practical-ish Introduction to Data Science
A Practical-ish Introduction to Data ScienceMark West
 
The path to be a data scientist
The path to be a data scientistThe path to be a data scientist
The path to be a data scientistPoo Kuan Hoong
 
CRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologyCRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologySergey Shelpuk
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningTamir Taha
 
1. introduction to data science —
1. introduction to data science —1. introduction to data science —
1. introduction to data science —swethaT16
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and AnalyticsSrinath Perera
 
AI Orange Belt - Session 4
AI Orange Belt - Session 4AI Orange Belt - Session 4
AI Orange Belt - Session 4AI Black Belt
 
AI Orange Belt - Session 1
AI Orange Belt - Session 1AI Orange Belt - Session 1
AI Orange Belt - Session 1AI Black Belt
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksBICA Labs
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning SystemsXavier Amatriain
 

Mais procurados (20)

Managing machine learning
Managing machine learningManaging machine learning
Managing machine learning
 
CRISP-DM - Agile Approach To Data Mining Projects
CRISP-DM - Agile Approach To Data Mining ProjectsCRISP-DM - Agile Approach To Data Mining Projects
CRISP-DM - Agile Approach To Data Mining Projects
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
From Raw Data to Deployed Product. Fast & Agile with CRISP-DM
From Raw Data to Deployed Product. Fast & Agile with CRISP-DMFrom Raw Data to Deployed Product. Fast & Agile with CRISP-DM
From Raw Data to Deployed Product. Fast & Agile with CRISP-DM
 
End-to-End Machine Learning Project
End-to-End Machine Learning ProjectEnd-to-End Machine Learning Project
End-to-End Machine Learning Project
 
Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...
 
Top 10 Data Science Practitioner Pitfalls
Top 10 Data Science Practitioner PitfallsTop 10 Data Science Practitioner Pitfalls
Top 10 Data Science Practitioner Pitfalls
 
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
 
A Practical-ish Introduction to Data Science
A Practical-ish Introduction to Data ScienceA Practical-ish Introduction to Data Science
A Practical-ish Introduction to Data Science
 
The path to be a data scientist
The path to be a data scientistThe path to be a data scientist
The path to be a data scientist
 
CRISP-DM: a data science project methodology
CRISP-DM: a data science project methodologyCRISP-DM: a data science project methodology
CRISP-DM: a data science project methodology
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
Ml intro
Ml introMl intro
Ml intro
 
1. introduction to data science —
1. introduction to data science —1. introduction to data science —
1. introduction to data science —
 
Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and Analytics
 
AI Orange Belt - Session 4
AI Orange Belt - Session 4AI Orange Belt - Session 4
AI Orange Belt - Session 4
 
AI Orange Belt - Session 1
AI Orange Belt - Session 1AI Orange Belt - Session 1
AI Orange Belt - Session 1
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural Networks
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 

Destaque

Barga Data Science lecture 9
Barga Data Science lecture 9Barga Data Science lecture 9
Barga Data Science lecture 9Roger Barga
 
PARA QUE SEPAS QUIÉN TE CANTA POR LAS MAÑANAS
PARA QUE SEPAS QUIÉN TE CANTA POR LAS MAÑANASPARA QUE SEPAS QUIÉN TE CANTA POR LAS MAÑANAS
PARA QUE SEPAS QUIÉN TE CANTA POR LAS MAÑANASdegatitos
 
Lo Stretto digitale - 0
Lo Stretto digitale - 0Lo Stretto digitale - 0
Lo Stretto digitale - 0anucita
 
Diploma a pachi
Diploma a pachiDiploma a pachi
Diploma a pachidegatitos
 
Отчет куратора Школы, 2015 год
Отчет куратора Школы, 2015 годОтчет куратора Школы, 2015 год
Отчет куратора Школы, 2015 годnizhgma.ru
 
Francesco Micali : Dal sito internet al network diocesano - Mediabeta srl
Francesco Micali : Dal sito internet al network diocesano - Mediabeta srlFrancesco Micali : Dal sito internet al network diocesano - Mediabeta srl
Francesco Micali : Dal sito internet al network diocesano - Mediabeta srlf.micali
 

Destaque (10)

Barga Data Science lecture 9
Barga Data Science lecture 9Barga Data Science lecture 9
Barga Data Science lecture 9
 
PARA QUE SEPAS QUIÉN TE CANTA POR LAS MAÑANAS
PARA QUE SEPAS QUIÉN TE CANTA POR LAS MAÑANASPARA QUE SEPAS QUIÉN TE CANTA POR LAS MAÑANAS
PARA QUE SEPAS QUIÉN TE CANTA POR LAS MAÑANAS
 
Lo Stretto digitale - 0
Lo Stretto digitale - 0Lo Stretto digitale - 0
Lo Stretto digitale - 0
 
Supply Chain Planning Paper
Supply Chain Planning PaperSupply Chain Planning Paper
Supply Chain Planning Paper
 
Diploma a pachi
Diploma a pachiDiploma a pachi
Diploma a pachi
 
Отчет куратора Школы, 2015 год
Отчет куратора Школы, 2015 годОтчет куратора Школы, 2015 год
Отчет куратора Школы, 2015 год
 
Francesco Micali : Dal sito internet al network diocesano - Mediabeta srl
Francesco Micali : Dal sito internet al network diocesano - Mediabeta srlFrancesco Micali : Dal sito internet al network diocesano - Mediabeta srl
Francesco Micali : Dal sito internet al network diocesano - Mediabeta srl
 
Present progressive
Present progressivePresent progressive
Present progressive
 
World Cancer Day Awareness In UAE, 2017
World Cancer Day Awareness In UAE, 2017World Cancer Day Awareness In UAE, 2017
World Cancer Day Awareness In UAE, 2017
 
Proyecto final
Proyecto final Proyecto final
Proyecto final
 

Semelhante a Barga Data Science lecture 2

Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification courseKumarNaik21
 
Choosing a Machine Learning technique to solve your need
Choosing a Machine Learning technique to solve your needChoosing a Machine Learning technique to solve your need
Choosing a Machine Learning technique to solve your needGibDevs
 
Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)SayyedYusufali
 
Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)SayyedYusufali
 
Data science training in hydpdf converted (1)
Data science training in hydpdf  converted (1)Data science training in hydpdf  converted (1)
Data science training in hydpdf converted (1)SayyedYusufali
 
Data Science Training and Placement
Data Science Training and PlacementData Science Training and Placement
Data Science Training and PlacementAkhilGGM
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?DIGITALSAI1
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification courseKumarNaik21
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)SayyedYusufali
 
Data science training institute in hyderabad
Data science training institute in hyderabadData science training institute in hyderabad
Data science training institute in hyderabadVamsiNihal
 
Data science training in Hyderabad
Data science  training in HyderabadData science  training in Hyderabad
Data science training in Hyderabadsaitejavella
 
Data science training Hyderabad
Data science training HyderabadData science training Hyderabad
Data science training HyderabadNithinsunil1
 
Data science online training in hyderabad
Data science online training in hyderabadData science online training in hyderabad
Data science online training in hyderabadVamsiNihal
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)SayyedYusufali
 
data science training and placement
data science training and placementdata science training and placement
data science training and placementSaiprasadVella
 
online data science training
online data science trainingonline data science training
online data science trainingDIGITALSAI1
 
Data science online training in hyderabad
Data science online training in hyderabadData science online training in hyderabad
Data science online training in hyderabadVamsiNihal
 
data science online training in hyderabad
data science online training in hyderabaddata science online training in hyderabad
data science online training in hyderabadVamsiNihal
 
Best data science training in Hyderabad
Best data science training in HyderabadBest data science training in Hyderabad
Best data science training in HyderabadKumarNaik21
 
Data science training Hyderabad
Data science training HyderabadData science training Hyderabad
Data science training HyderabadNithinsunil1
 

Semelhante a Barga Data Science lecture 2 (20)

Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification course
 
Choosing a Machine Learning technique to solve your need
Choosing a Machine Learning technique to solve your needChoosing a Machine Learning technique to solve your need
Choosing a Machine Learning technique to solve your need
 
Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)Data science training in hyd ppt converted (1)
Data science training in hyd ppt converted (1)
 
Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)Data science training in hyd pdf converted (1)
Data science training in hyd pdf converted (1)
 
Data science training in hydpdf converted (1)
Data science training in hydpdf  converted (1)Data science training in hydpdf  converted (1)
Data science training in hydpdf converted (1)
 
Data Science Training and Placement
Data Science Training and PlacementData Science Training and Placement
Data Science Training and Placement
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification course
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)
 
Data science training institute in hyderabad
Data science training institute in hyderabadData science training institute in hyderabad
Data science training institute in hyderabad
 
Data science training in Hyderabad
Data science  training in HyderabadData science  training in Hyderabad
Data science training in Hyderabad
 
Data science training Hyderabad
Data science training HyderabadData science training Hyderabad
Data science training Hyderabad
 
Data science online training in hyderabad
Data science online training in hyderabadData science online training in hyderabad
Data science online training in hyderabad
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)
 
data science training and placement
data science training and placementdata science training and placement
data science training and placement
 
online data science training
online data science trainingonline data science training
online data science training
 
Data science online training in hyderabad
Data science online training in hyderabadData science online training in hyderabad
Data science online training in hyderabad
 
data science online training in hyderabad
data science online training in hyderabaddata science online training in hyderabad
data science online training in hyderabad
 
Best data science training in Hyderabad
Best data science training in HyderabadBest data science training in Hyderabad
Best data science training in Hyderabad
 
Data science training Hyderabad
Data science training HyderabadData science training Hyderabad
Data science training Hyderabad
 

Mais de Roger Barga

RS Barga STRATA'18 New York City
RS Barga STRATA'18 New York CityRS Barga STRATA'18 New York City
RS Barga STRATA'18 New York CityRoger Barga
 
Barga Strata'18 presentation
Barga Strata'18 presentationBarga Strata'18 presentation
Barga Strata'18 presentationRoger Barga
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10Roger Barga
 
Barga Data Science lecture 8
Barga Data Science lecture 8Barga Data Science lecture 8
Barga Data Science lecture 8Roger Barga
 
Barga Data Science lecture 7
Barga Data Science lecture 7Barga Data Science lecture 7
Barga Data Science lecture 7Roger Barga
 
Barga Data Science lecture 5
Barga Data Science lecture 5Barga Data Science lecture 5
Barga Data Science lecture 5Roger Barga
 
Barga Data Science lecture 3
Barga Data Science lecture 3Barga Data Science lecture 3
Barga Data Science lecture 3Roger Barga
 
Barga IC2E & IoTDI'16 Keynote
Barga IC2E & IoTDI'16 KeynoteBarga IC2E & IoTDI'16 Keynote
Barga IC2E & IoTDI'16 KeynoteRoger Barga
 

Mais de Roger Barga (8)

RS Barga STRATA'18 New York City
RS Barga STRATA'18 New York CityRS Barga STRATA'18 New York City
RS Barga STRATA'18 New York City
 
Barga Strata'18 presentation
Barga Strata'18 presentationBarga Strata'18 presentation
Barga Strata'18 presentation
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10
 
Barga Data Science lecture 8
Barga Data Science lecture 8Barga Data Science lecture 8
Barga Data Science lecture 8
 
Barga Data Science lecture 7
Barga Data Science lecture 7Barga Data Science lecture 7
Barga Data Science lecture 7
 
Barga Data Science lecture 5
Barga Data Science lecture 5Barga Data Science lecture 5
Barga Data Science lecture 5
 
Barga Data Science lecture 3
Barga Data Science lecture 3Barga Data Science lecture 3
Barga Data Science lecture 3
 
Barga IC2E & IoTDI'16 Keynote
Barga IC2E & IoTDI'16 KeynoteBarga IC2E & IoTDI'16 Keynote
Barga IC2E & IoTDI'16 Keynote
 

Último

Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 

Último (20)

Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 

Barga Data Science lecture 2

  • 1. Deriving Knowledge from Data at Scale
  • 2. Deriving Knowledge from Data at Scale An Introduction to Statistical Learning with Applications in R The Elements of Statistical Learning A Programmer’s Guide to Data Mining Probabilistic Programming & Bayesian Methods for Hackers Think Bayes, Bayesian Statistics Made Simple
  • 3. Deriving Knowledge from Data at Scale Data Mining and Analysis, Fundamental Concepts and Algorithms An Introduction to Data Science Machine Learning Machine Learning – The Complete Guide
  • 4. Deriving Knowledge from Data at Scale Bayesian Reasoning and Machine Learning A Course in Machine Learning Information Theory, Inference and Learning Algorithms Modeling with Data Mining of Massive Datasets Information Theory, Inference and Learning Algorithms
  • 5. Deriving Knowledge from Data at Scale
  • 6. Deriving Knowledge from Data at Scale • Introduction to machine learning • Overview of the machine learning process • Introduction to select algorithms • Introduction to concepts we will examine later in course: bias/variance, generalization, underfitting, overfitting, ensemble methods, etc. • Elements of a time series • Functions to manipulate time series • Lay a foundation for Prediction and Forecasting…
  • 7. Deriving Knowledge from Data at Scale http://www.cs.waikato.ac.nz/ml/weka/ install by next class, get the developer version…
  • 8. Deriving Knowledge from Data at Scale • Class Discussions 20 minutes • Machine Learning Primer 60 minutes • Break 15 minutes • Time Series & Forecasting, 1/2 60 minutes • Closing discussion 15 minutes
  • 9. Deriving Knowledge from Data at Scale so we need teams
  • 10. Deriving Knowledge from Data at Scale Discussion on Assigned Reading… Week One
  • 11. Deriving Knowledge from Data at Scale The Data Science Workflow Lecture 1 Review Stay in the immediate zone during exploratory modeling, to extent possible: • 5 to 10 minutes per experiment, results in 100’s per day; • Small, statistically sound/relevant samples; • Linear modelling during feature exploration; • Don’t write ML algorithms, use packages, do learn to write data manipulation code; Start with a sample of data, as soon as you can get it… • You will learn a considerable amount from that first data sample • Quality of the data, what fields are being collected, possible missing values and/or missing fields, and the questions will begin to flow (dialogue with customer will be at a much more meaningful level). If your customer can’t quantify it, you can’t change/improve it…
  • 12. Deriving Knowledge from Data at Scale The Data Science Workflow Loops within loops…
  • 13. Deriving Knowledge from Data at Scale The Data Science Workflow Define the Goal The first task in a data science project is to define a measurable & quantifiable goal. At this stage, learn all that you can about the context of your project. • Why do the project sponsors want the project in the first place? What do they lack, and what do they need? • What are they doing to solve the problem now, and why isn't that good enough? • What resources will you need: what kind of data, how much staff, will you have domain experts to collaborate with, what are the computational resources? • How do the project sponsors plan to deploy your results? What are the constraints that have to be met for successful deployment? Not We want to get better at finding bad loans. But We want to reduce our rate of loan charge-offs by at least 10%, using a model that predicts which loan applicants are likely to default.
  • 14. Deriving Knowledge from Data at Scale A concrete goal begets concrete stopping conditions and concrete acceptance criteria. The less specific the goal, the likelier that the project will go unbounded, because no result will be "good enough." If you don't know what you want to achieve, you don't know when to stop trying – or even what to try. When the project eventually terminates – because either time or resources run out – no one will be happy with the outcome…
  • 15. Deriving Knowledge from Data at Scale The Data Science Workflow Data Collection and Management This step encompasses identifying the data you need, exploring it, and conditioning it to be suitable for analysis. This stage is often the most time- consuming step in the process. It's also one of the most important. • What data is available to me? • Will it help me solve the problem? • Is it enough? • Is it of good enough quality? Rule First piece of data is very informative, data set utility is roughly logarithmic in size. Prefer Direct measurements, but if they are not available identify proxy variables.
  • 16. Deriving Knowledge from Data at Scale
  • 17. Deriving Knowledge from Data at Scale The Data Science Workflow Modeling We get to statistics and machine learning during the modeling stage. Here is where you try to extract useful insights from the data. Many modeling procedures make specific assumptions about data distribution and relationships, there will be back-and- forth between the modeling and data cleaning stages as you try to find the best way to represent and model the data. The most common data science modeling tasks are: • Classification: deciding if something belongs to one category or another. • Scoring: predicting or estimating a numeric value such as a price or probability. • Ranking: learning to order items by preferences. • Clustering: grouping items into most similar groups. • Finding Relations: finding correlations or potential causes of effects seen in the data. • Characterization: very general plotting and report generation from data.
  • 18. Deriving Knowledge from Data at Scale The Data Science Workflow Model Evaluation and Critique Once you have a model, you need to determine if it meets your goals. • Is it accurate enough for your needs? • Does it generalize well? • Does it perform better than "the obvious guess"? Better than whatever estimate you currently use? • Do the results of the model (coefficients, clusters, rules) make sense in the context of the problem domain? If you've answered "no" to any of the above questions, it's time to loop back to the modeling step — or decide that the data doesn't support the goal you are trying to achieve.
  • 19. Deriving Knowledge from Data at Scale The Data Science Workflow Model Deployment and Maintenance Finally, the model is put into operation. In many organizations this means the data scientist no longer has primary responsibility for the day-to-day operation of the model. However, you still should ensure that the model will run smoothly and will not make disastrous unsupervised decisions. You also want to make sure that the model can be updated as its environment changes. And in many situations, the model will initially be deployed in a small pilot program. The test might bring out issues that you didn't anticipate, and you will have to adjust the model appropriately.
  • 20. Deriving Knowledge from Data at Scale The Data Science Workflow Presentation and Documentation Once you have a model that meets your success criteria, you will present your results to your project sponsor and other stakeholders. You must also document the model for those in the organization who are responsible for using, running, and maintaining the model once it has been deployed. Model interpretability may be an issue…
  • 21. Deriving Knowledge from Data at Scale The Data Science Workflow Loops within loops…
  • 22. Deriving Knowledge from Data at Scale • Class Discussions 30 minutes • Machine Learning Primer 60 minutes • Break 15 minutes • Time Series & Forecasting 60 minutes • Closing discussion 15 minutes
  • 23. Deriving Knowledge from Data at Scale
  • 24.
  • 25. 1 1 5 4 3 7 5 3 5 3 5 5 9 0 6 3 5 2 0 0
  • 26.
  • 27.
  • 28.
  • 29. training data (expensive) synthetic training data (cheaper)
  • 30. solve hard problems value from Big Data human intelligence
  • 31. Machine learning enables nearly every value proposition of web search.
  • 32. Hundreds of thousands of machines… Hundreds of metrics and signals per machine… Which signals correlate with the real cause of a problem? How can we extract effective repair actions?
  • 33. solve hard problems value from Big Data human intelligence business analytics
  • 34. Predicting future performance from historical data Recommenda- tion engines Advertising analysis Weather forecasting for business planning Social network analysis IT infrastructure and web app optimization Legal discovery and document archiving Pricing analysis Fraud detection Churn analysis Equipment monitoring Location-based tracking and services Personalized Insurance Predictive analytics address the likelihood of something happening in the future, even if it is just an instant later…
  • 35. object encoded with features (think DB attributes/ OO member fields of primitive types) 𝑑 is the feature dimensionality. classifier prediction (response/dependent variable). Can be qualitative/quantitative (classification/regression). 𝑶𝒃𝒋𝒆𝒄𝒕 → 𝑶𝒖𝒕𝒄𝒐𝒎𝒆 Entity → Category Entity → PopularityComplex decision making: 𝑿 → 𝒀 𝑿 = (𝒙 𝟎, … , 𝒙 𝒅) → 𝒀 input/independent variable We may know the relation for certain values of 𝑋 and 𝑌: In fact, we may know the relation for many 𝒙s and 𝑦s: 𝒙, 𝑦 𝒙 𝟏 , 𝑦 𝟏 , … , 𝒙 𝑵 , 𝑦 𝑵 The 𝑖-th 𝑥 is: 𝒙(𝑖) = 𝑥0 (𝑖) , … , 𝑥 𝑑 (𝑖)
  • 36. TRAINING Input 𝒙 = (𝑥 𝟎, … , 𝑥 𝒅) Online System object encoded with features classifier prediction (response/dependent variable) Final Output 𝑦 Model Offline Training Sub-system Training Data 𝒙 𝟏 , 𝑦 𝟏 , … , 𝒙 𝑵 , 𝑦 𝑵 where 𝒙(𝒊) = 𝑥 𝟎 (𝒊) , … , 𝑥 𝒅 (𝒊) 𝑓 𝑋 = 𝑌𝑋 → 𝑌 Task is very complex . Hard to construct good 𝑓. We construct an approximation to 𝑓: 𝑔(𝑋) ≈ 𝑌 Hypothesis space: 𝑔(𝑋) ∈ 𝐻.
  • 37. Deriving Knowledge from Data at Scale The decision is driven by both the nature of your data and the question you are trying to answer
  • 38. Deriving Knowledge from Data at Scale Out of Class Reading Week Two
  • 39. gender age smoker eye color male 19 yes green female 44 yes gray male 49 yes blue male 12 no brown female 37 no brown female 60 no brown male 44 no blue female 27 yes brown female 51 yes green female 81 yes gray male 22 yes brown male 29 no blue lung cancer no yes yes no no yes no no yes no no no male 77 yes gray male 19 yes green female 44 no gray ? ? ?
  • 40. gender age smoker eye color male 19 yes green female 44 yes gray male 49 yes blue male 12 no brown female 37 no brown female 60 no brown male 44 no blue female 27 yes brown female 51 yes green female 81 yes gray male 22 yes brown male 29 no blue lung cancer no yes yes no no yes no no yes no no no male 77 yes gray male 19 yes green female 44 no gray ? ? ? Train ML Model
  • 41. gender age smoker eye color male 19 yes green female 44 yes gray male 49 yes blue male 12 no brown female 37 no brown female 60 no brown male 44 no blue female 27 yes brown female 51 yes green female 81 yes gray male 22 yes brown male 29 no blue lung cancer no yes yes no no yes no no yes no no no male 77 yes gray male 19 yes green female 44 no gray yes no no Train ML Model
  • 42. Always focuses on a bias-variance decomposition We have a lecture dedicated to evaluation metrics for models.
  • 43.
  • 44.
  • 45. key easily the most important factor
  • 46. key CITY 1 LAT. CITY 1 LNG. CITY 2 LAT. CITY 2 LNG. DRIVABLE? 123.24 46.71 121.33 47.34 Yes 123.24 56.91 121.33 55.23 Yes 123.24 46.71 121.33 55.34 No 123.24 46.71 130.99 47.34 No
  • 48. More data wins Once you’ve defined your input fields, there’s only so much analytic gymnastics you can do. Computer algorithms trying to learn models have only a relatively few tricks they can do efficiently, and many of them are not so very different. Performance differences between algorithms are typically not large. Thus, if you want better classifiers: 1. Engineer better features 2. Get your hands on more high-quality data
  • 49. The power of ensembles models vote for the final prediction
  • 50. Occam’s Razor smaller faster to fit more interpretable
  • 52. observational data can only show us that two variables are related, but it cannot tell us the “why” Freakonomics tended to have higher standardized test scores
  • 53. Deriving Knowledge from Data at Scale Break, 5 minutes…
  • 54.
  • 55. Deriving Knowledge from Data at Scale boosting)
  • 56. Deriving Knowledge from Data at Scale
  • 57. Deriving Knowledge from Data at Scale Ensemble Classification
  • 58. Deriving Knowledge from Data at Scale Let’s look at the Netflix Prize Competition…
  • 59. Deriving Knowledge from Data at Scale Began October 2006
  • 60. Deriving Knowledge from Data at Scale
  • 61. Deriving Knowledge from Data at Scale from http://www.research.att.com/~volinsky/netflix/ However, improvement slowed…
  • 62. Deriving Knowledge from Data at Scale Today, the top team has posted a 8.5% improvement. Ensemble methods are the best performers…
  • 63. Deriving Knowledge from Data at Scale “Thanks to Paul Harrison's collaboration, a simple mix of our solutions improved our result from 6.31 to 6.75” Rookies
  • 64. Deriving Knowledge from Data at Scale “My approach is to combine the results of many methods (also two- way interactions between them) using linear regression on the test set. The best method in my ensemble is regularized SVD with biases, post processed with kernel ridge regression” Arek Paterek http://rainbow.mimuw.edu.pl/~ap/ap_kdd.pdf
  • 65. Deriving Knowledge from Data at Scale “When the predictions of multiple RBM models and multiple SVD models are linearly combined, we achieve an error rate that is well over 6% better than the score of Netflix’s own system.” U of Toronto http://www.cs.toronto.edu/~rsalakhu/papers/rbmcf.pdf
  • 66. Deriving Knowledge from Data at Scale Gravity home.mit.bme.hu/~gtakacs/download/gravity.pdf
  • 67. Deriving Knowledge from Data at Scale “Our common team blends the result of team Gravity and team Dinosaur Planet.” Might have guessed from the name… When Gravity and Dinosaurs Unite
  • 68. Deriving Knowledge from Data at Scale And, yes, the top team which is from AT&T… “Our final solution (RMSE=0.8712) consists of blending 107 individual results. “ BellKor / KorBell
  • 69. Deriving Knowledge from Data at Scale
  • 70. Deriving Knowledge from Data at Scale • 83.7% majority vote accuracy • 99.9% majority vote accuracy Intuitions
  • 71. Deriving Knowledge from Data at Scale Boosting Bagging
  • 72. Deriving Knowledge from Data at Scale Leo Breiman
  • 73. Deriving Knowledge from Data at Scale • Resampl few data • Resample the minority data skew
  • 74. Deriving Knowledge from Data at Scale N=10
  • 75. Deriving Knowledge from Data at Scale
  • 76. Deriving Knowledge from Data at Scale
  • 77. Deriving Knowledge from Data at Scale
  • 78. Deriving Knowledge from Data at Scale Break, 5 minutes…
  • 79. Deriving Knowledge from Data at Scale Time Series & Forecasting, 1/2
  • 80. Deriving Knowledge from Data at Scale median filter, a windowing technique that moves through the data point-by-point, and replaces it with the median value calculated for the current window…
  • 81. Deriving Knowledge from Data at Scale Visual inspection reveals smoothing of the outliers without dampening the naturally occurring peaks and troughs (no signal loss). Prior to smoothing, we could see no correlation in our data, but afterwards, Spearman’s Rho was ~0.5 for almost all parameters.
  • 82. Deriving Knowledge from Data at Scale ? ? ?
  • 83. Deriving Knowledge from Data at Scale 3.7 12.5 9.0
  • 84. Deriving Knowledge from Data at Scale Forecasting Quantitative Causal Model Trend Time series Stationary Trend Trend + Seasonality Qualitative Expert Judgment Delphi Method Grassroots
  • 85. Deriving Knowledge from Data at Scale Year 1 Year 2 Year 3 Year 4 Demandforproductorservice
  • 86. Deriving Knowledge from Data at Scale Year 1 Year 2 Year 3 Year 4 Demandforproductorservice Trend component Actual demand line Seasonal peaks Random variation
  • 87. Deriving Knowledge from Data at Scale87
  • 88. Deriving Knowledge from Data at Scale88
  • 89. Deriving Knowledge from Data at Scale • Naïve • Moving Average • Exponential Smoothing • Regression
  • 90. Deriving Knowledge from Data at Scale
  • 91. Deriving Knowledge from Data at Scale 91
  • 92. Deriving Knowledge from Data at Scale92 smoothing time n A+...+A+A+A =F 1n-t2-t1-tt 1t   Ft+1 = Forecast for the upcoming period, t+1 n = Number of periods to be averaged A t = Actual occurrence in period t
  • 93. Deriving Knowledge from Data at Scale 3 n A+...+A+A+A =F 1n-t2-t1-tt 1t   Month Sales (000) 1 4 2 6 3 5 4 ? 5 ? 6 ?
  • 94. Deriving Knowledge from Data at Scale Month Sales (000) Moving Average (n=3) 1 4 NA 2 6 NA 3 5 NA 4 ? 5 ? (4+6+5)/3=5 6 ? n A+...+A+A+A =F 1n-t2-t1-tt 1t   You’re manager in Amazon’s electronics department. You want to forecast ipod sales for months 4-6 using a 3-period moving average.
  • 95. Deriving Knowledge from Data at Scale Month Sales (000) Moving Average (n=3) 1 4 NA 2 6 NA 3 5 NA 4 3 5 ? 5 6 ? ?
  • 96. Deriving Knowledge from Data at Scale Month Sales (000) Moving Average (n=3) 1 4 NA 2 6 NA 3 5 NA 4 3 5 ? 5 6 ? (6+5+3)/3=4.667
  • 97. Deriving Knowledge from Data at Scale Month Sales (000) Moving Average (n=3) 1 4 NA 2 6 NA 3 5 NA 4 3 5 7 5 6 ? 4.667?
  • 98. Deriving Knowledge from Data at Scale Month Sales (000) Moving Average (n=3) 1 4 NA 2 6 NA 3 5 NA 4 3 5 7 5 6 ? 4.667 (5+3+7)/3=5
  • 99. Deriving Knowledge from Data at Scale99
  • 100. Deriving Knowledge from Data at Scale 100
  • 101. Deriving Knowledge from Data at Scale10
  • 102. Deriving Knowledge from Data at Scale Month Weighted Moving Average 1 4 NA 2 6 NA 3 5 NA 4 31/6 = 5.167 5 6 ? ? ? 1n-tn2-t31-t2t11t Aw+...+Aw+Aw+Aw=F  Sales (000)
  • 103. Deriving Knowledge from Data at Scale Month Sales (000) Weighted Moving Average 1 4 NA 2 6 NA 3 5 NA 4 3 31/6 = 5.167 5 7 6 25/6 = 4.167 32/6 = 5.333 1n-tn2-t31-t2t11t Aw+...+Aw+Aw+Aw=F 
  • 104. Deriving Knowledge from Data at Scale10
  • 105. Deriving Knowledge from Data at Scale10
  • 106. Deriving Knowledge from Data at Scale 106 simple any • Previous time previous actual and forecasted value
  • 107. Deriving Knowledge from Data at Scale • gives more weight to recent time periods Ft+1 = Ft + a(At - Ft) et Ft+1 = Forecast value for time t+1 At = Actual value at time t a = Smoothing constant Need initial forecast Ft to start.
  • 108. Deriving Knowledge from Data at Scale Week Demand 1 820 2 775 3 680 4 655 5 750 6 802 7 798 8 689 9 775 10 Given the weekly demand data what are the exponential smoothing forecasts for periods 2-10 using a=0.10? Assume F1=D1 Ft+1 = Ft + a(At - Ft) i Ai
  • 109. Deriving Knowledge from Data at Scale Week Demand 0.1 0.6 1 820 820.00 820.00 2 775 820.00 820.00 3 680 815.50 793.00 4 655 801.95 725.20 5 750 787.26 683.08 6 802 783.53 723.23 7 798 785.38 770.49 8 689 786.64 787.00 9 775 776.88 728.20 10 776.69 756.28 Ft+1 = Ft + a(At - Ft) a = F2 = F1+ a(A1–F1) =820+.1(820–820) =820 i Ai Fi
  • 110. Deriving Knowledge from Data at Scale Week Demand 0.1 0.6 1 820 820.00 820.00 2 775 820.00 820.00 3 680 815.50 793.00 4 655 801.95 725.20 5 750 787.26 683.08 6 802 783.53 723.23 7 798 785.38 770.49 8 689 786.64 787.00 9 775 776.88 728.20 10 776.69 756.28 Ft+1 = Ft + a(At - Ft) a = F3 = F2+ a(A2–F2) =820+.1(775–820) =815.5 i Ai Fi
  • 111. Deriving Knowledge from Data at Scale Week Demand 0.1 0.6 1 820 820.00 820.00 2 775 820.00 820.00 3 680 815.50 793.00 4 655 801.95 725.20 5 750 787.26 683.08 6 802 783.53 723.23 7 798 785.38 770.49 8 689 786.64 787.00 9 775 776.88 728.20 10 776.69 756.28 Ft+1 = Ft + a(At - Ft) This process continues through week 10 a = i Ai Fi
  • 112. Deriving Knowledge from Data at Scale Week Demand 0.1 0.6 1 820 820.00 820.00 2 775 820.00 820.00 3 680 815.50 793.00 4 655 801.95 725.20 5 750 787.26 683.08 6 802 783.53 723.23 7 798 785.38 770.49 8 689 786.64 787.00 9 775 776.88 728.20 10 776.69 756.28 Ft+1 = Ft + a(At - Ft) What if the a constant equals 0.6 a = a = i Ai Fi
  • 113. Deriving Knowledge from Data at Scale α • depends on the emphasis you want to place on the most recent data α
  • 114. Deriving Knowledge from Data at Scale Ft+1 = a At + a(1- a) At - 1 + a(1- a)2At - 2 + ... Weights Prior Period a 2 periods ago a(1 - a) 3 periods ago a(1 - a)2 a= a= 0.10 a= 0.90 10% 9% 8.1% 90% 9% 0.9% Ft+1 = Ft + a (At - Ft) or w1 w2 w3
  • 115. Deriving Knowledge from Data at Scale 115
  • 116. Deriving Knowledge from Data at Scale Trend Seasonal
  • 117. Deriving Knowledge from Data at Scale11
  • 118. Deriving Knowledge from Data at Scale 0 1 2 3 4 5 6 7 1 2 3 4 5 6 7 8 9 10
  • 119. Deriving Knowledge from Data at Scale b0 b1 0 2 4 6 8 10 12 10 11 12 13 14 15 16 17 18 19 20 Best line! Intercept
  • 120. Deriving Knowledge from Data at Scale deseasonalize • Reseasonalize
  • 121. Deriving Knowledge from Data at Scale Deseasonalize Forecast Reseasonalize Actual data Deseasonalized data Example (SI + Regression)
  • 122. Deriving Knowledge from Data at Scale12
  • 123. Deriving Knowledge from Data at Scale 123 Year Sales ($ mil.) 2002 7 2003 10 2004 9 2005 11 2006 13 Linear Trend – Using the Least Squares Method: An Example Year t Sales ($ mil.) 2002 1 7 2003 2 10 2004 3 9 2005 4 11 2006 5 13
  • 124. Deriving Knowledge from Data at Scale 124
  • 125. Deriving Knowledge from Data at Scale12
  • 126. Deriving Knowledge from Data at Scale 126 (Excel function: =log(x) or log(x,10)
  • 127. Deriving Knowledge from Data at Scale12
  • 128. Deriving Knowledge from Data at Scale SI SI
  • 129. Deriving Knowledge from Data at Scale year raw index quarter four
  • 130. Deriving Knowledge from Data at Scale130 The table below shows the quarterly sales for Toys International for the years 2001 through 2006. The sales are reported in millions of dollars. Determine a quarterly seasonal index using the ratio-to-moving-average method. Seasonal Index – An Example
  • 131. Deriving Knowledge from Data at Scale13
  • 132. Deriving Knowledge from Data at Scale13
  • 133. Deriving Knowledge from Data at Scale13
  • 134. Deriving Knowledge from Data at Scale deseasonalize
  • 135. Deriving Knowledge from Data at Scale
  • 136. Deriving Knowledge from Data at Scale • Class Discussions 15 minutes • Forecasting, continued (2/2) 45 minutes • Weka Tutorial 30 minutes • Break 10 minutes • Decision Trees and Random Forests 60 minutes • Hands on, decision tree in Weka 15 minutes
  • 137. Deriving Knowledge from Data at Scale That’s all for tonight….