Machine learning -
A summer approach
July 21 2016
Introduction
Árpád Fülöp arpad.fulop@balabit.com
László Kovács laszlo.kovacs@balabit.com
What to expect
To have an introduction to machine learning by walking through the steps
of a real project as an example.
1. What does a data scientist do
2. Basic ideas in machine learning
3. Demo: a machine learning application
4. Data preparation for predictive modeling
5. Building predictive models: Decision trees and forests
6. Validating predictive models
7. Measuring performance of predictive models
Part 1
What does a data scientist do
(http://www.marketingdistillery.com/2014/11/29/is-data-science-a-buzzword-modern-data-scientist-defined/)
An interdisciplinary job
Prediction
The ability to make reliable predictions about future events by using the
patterns seen in historical data.
Examples:
- Which one of my customers will end their contract
based on their mobile phone usage data?
- Given the friendship graph of my users,
what new connections are likely to be made?
Anomaly detection
Uncovering unusual events and potential fraud by noticing deviations of the data
from what is normal.
Examples:
- It could be suspicious if a customer suddenly
consumes much less power than is usual for
them according to the data from the meters.
- By knowing which user typically issues which
commands, I am able to recognize weird and
outlying operations on a computer.
Gaining insights
Extracting hidden connections, knowledge about our customers, products,
business processes.
Examples:
- Based on the data about their visits, we can
discover typical segments of users and observe
in which respects they use our website similarly
- We crawl Twitter for thousands of pieces of user
feedback and learn the general sentiment and
emotions towards our company
Making valid decisions
The ability to validate business-related hypotheses or to compare
alternatives in a mathematical sense.
Examples:
- Will the subscription rate drop if we change the text
used in our email marketing campaign?
- How do I redesign my web page to maximize the time
spent by the visitors?
1) Define an experiment. 2) Measure the results on a
sample. 3) Infer the properties of the whole population.
What do we do?
We are building a data-driven IT security product.
The software aims to find anomalies in IT-security-related system logs.
The behavior of the users of the IT system is analyzed and, if unusual
behavior is detected, alerts are raised.
This helps the work of a company’s IT security experts by drawing their
attention to the most important events in the system.
What is Big Data?
Whatever this is
What is Big Data?
From a technical point of view:
“a term for data sets that are so large or complex that traditional data processing
applications are inadequate” (Wikipedia) -> Infrastructure-wise Big Data
From a layman’s point of view:
“extremely large data sets that may be analysed computationally to reveal
patterns, trends, and associations, especially relating to human behaviour and
interactions” (Oxford Dictionary) -> Impact-wise Big Data
What is Big Data?
The 3 Vs:
- Volume: the data to be processed takes GBs, TBs of space
- Velocity: new data comes frequently at a high speed
- Variety: the data has a variety of formats and cannot be stored in
tabular (relational) form
(Gartner)
Data scientist vs Big Data
Data scientists
Professionals who
deal with big data
Part 2
Basic ideas in machine learning
Types of learning
Understanding (meaningful learning):
What you learn converts to understanding of the concept.
Your knowledge is general: you can apply it to new situations.
Memorizing (rote learning):
What you learn can be quickly recalled, but is superficial and cannot
be applied in another context.
Computers and memorization
Computers are excellent at accurately storing and recalling huge
amounts of data: documents, dictionaries, bits of video files, etc.
But this is solely memorizing. Do they understand what’s going on?
If a student memorizes a question
bank before the exam without
understanding a word, she might pass an
exam containing the same questions, but
will fail when answering new questions.
Machine learning
Machine learning is about enabling computers to learn through
examples.
The goal is, after having seen many examples, to find patterns that
generalize so well that they can be used in future situations.
Can a student actually gain understanding
seeing questions and their answers?
Meaningful machine learning
As data scientists, while using machine learning as a tool, our most
important task is to prevent memorizing (or overfitting in this
context) because we want to use the acquired knowledge for new
examples in the future.
Although the machine will never “understand” the
data,
we can motivate our algorithms to find trends,
correlation structures, and connections.
Nature of machine learning
A machine processes (learns from) more data than a human; it can deal
with amounts of data that we cannot.
With machines, learning can be automated;
machines deal with repetitive tasks more easily than humans.
The patterns found by the machines will never be perfect,
but given enough examples of appropriate quality and quantity, they will
be useful.
When to use machine learning
If the following conditions hold:
1) There is a pattern to be learned; a pattern between the questions
(inputs) and answers (output)
2) We cannot formulate the pattern mathematically
3) We have enough data (examples) for learning
(Abu-Mostafa: Learning from Data)
When not to use machine learning
To find out, for instance:
- The winning numbers of next week’s lottery (no pattern)
- The area of a triangle (can be formulated)
- The time of the next financial crisis (not enough data)
Learning game (Abu-Mostafa: Learning from Data)
label = “0”
label = “1”
label = ?
Learning game (Abu-Mostafa: Learning from Data)
Takeaways:
- There is no single solution; there are many possible
ones
- The more examples we see during learning, the more
confident we can be about our solution
Key aspects to consider in a machine learning task
Data: What are the examples, how do we get them?
Unit of observation: What is considered one example?
Observed features: What attributes do we store about an example?
Observed target variable: What is the attribute we want to be able to predict?
Outcome: What is the meaning of the predicted target variable?
Business Case: How can we use the predictions?
Predict if an employee wants to quit
Data: Personal, work-related data from HR database
Unit of observation: One employee
Observed features: Overtime, effectiveness, patterns in days-off and sick-days, commuting time, etc.
Observed target variable: Who quit in the past?
Outcome: What are the chances of someone quitting?
Business Case: Prevent quitting by focused countermeasures, e.g., mentoring.
Predicting flight delays
Data: Air traffic data from airport systems
Unit of observation: A single flight from A to B
Observed features: Origin, destination, airline, day of year, weather
Observed target variable: Delay in minutes
Outcome: Prediction of punctuality
Business Case: What is the expected loss due to delays?
Biometric authentication with mouse dynamics
Data: Server logs about user sessions
Unit of observation: A single movement of a mouse cursor from A to B
Observed features: Length, straightness, speed
Observed target variable: The username of the user
Outcome: Anomaly level of a user session
Business Case: Improved security with automatic alerts
Classify mood of music
Data: 500 mp3 files
Unit of observation: Song in mp3
Observed features: ?
Observed target variable: Manually defined labels, either “cheerful” or “blue”
Outcome: ?
Business Case: ?
How to represent in data table format
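            Feature #1   Feature #2   ...   Target variable
Example 1   ...          ...          ...   ...
Example 2   ...          ...          ...   ...
...         ...          ...          ...   ...
First row: headers. Other rows: examples (data points).
Columns: features (observed attributes), then the observed target variable.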
Part 3
Demo: a machine learning application
Part 4
Data preparation for predictive modeling
Data preprocessing
Raw data :( -> Data ready to be analyzed :)
- Data representations
- Join data tables
- Character encoding
- Aggregations
- Pivoting
- Parsing raw data
- Date formats
REPRODUCIBLE PROCESS
The data source: Remote Desktop connection logs
Architecture
Raw data of a session
record timestamp     client timestamp   button     state      x     y
1434623080.316000    4053743.247000     NoButton   Move       686   281
1434623080.419000    4053743.357000     NoButton   Move       687   287
1434623080.615000    4053743.559000     Left       Pressed    687   287
1434623080.745000    4053743.684000     Left       Released   687   287
1434623081.557000    4053744.495000     NoButton   Move       690   288
1434623081.667000    4053744.605000     NoButton   Move       742   300
… (some 10k lines)
Heatmap
The target variable: what is the goal of the analysis?
            Feature #1   Feature #2   ...   Target variable
Example 1   ...          ...          ...   ...
Example 2   ...          ...          ...   ...
...         ...          ...          ...   ...
Rows: examples (data points).
Columns: features (observed attributes), then the observed target variable.
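The examples: what would be appropriate as an example?
            Feature #1   Feature #2   ...   Made by user?
Example 1   ...          ...          ...   1
Example 2   ...          ...          ...   1
Example 3   ...          ...          ...   0
Example 4   ...          ...          ...   0
...         ...          ...          ...   ...
Rows: examples (data points).
Columns: features (observed attributes), then the observed target variable.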
Gesture
A gesture: moving the cursor from one point to another in one go.
- Large enough to capture the mouse moving characteristics of a user,
- Small enough to have a lot of them to learn from.
A possible definition of a gesture:
We process the raw file from the beginning row-by-row. At each step,
if the time difference is larger than 0.3 sec, or a mouse button is
pressed, the current gesture ends and a new one starts.
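The definition above can be sketched in a few lines of Python. This is only a minimal illustration, not the actual implementation of the product: the function name `split_into_gestures`, the tuple layout of an event, and the choice that the row with the button press is kept as the last row of the gesture it ends are all assumptions made for the example.

```python
# Each event is (timestamp in seconds, button, state, x, y).
# Assumption: a "Pressed" row stays in the gesture that it closes.

def split_into_gestures(events, max_gap=0.3):
    """Split a time-ordered list of mouse events into gestures."""
    gestures = []
    current = []
    for event in events:
        timestamp, _button, state, _x, _y = event
        # A pause longer than max_gap seconds closes the current gesture.
        if current and timestamp - current[-1][0] > max_gap:
            gestures.append(current)
            current = []
        current.append(event)
        # A button press also closes the current gesture.
        if state == "Pressed":
            gestures.append(current)
            current = []
    if current:
        gestures.append(current)
    return gestures


# Seven sample "Move" rows with the timestamps and coordinates of a session log.
events = [
    (1434623080.316, "NoButton", "Move", 686, 281),
    (1434623080.419, "NoButton", "Move", 687, 287),
    (1434623080.615, "NoButton", "Move", 687, 287),
    (1434623080.745, "NoButton", "Move", 687, 287),
    (1434623081.557, "NoButton", "Move", 690, 288),
    (1434623081.667, "NoButton", "Move", 742, 300),
    (1434623081.877, "NoButton", "Move", 748, 300),
]
gestures = split_into_gestures(events)
print([len(g) for g in gestures])  # sizes of the extracted gestures
```

With these rows the only gap above 0.3 sec is the 0.812 sec pause before the fifth row, so the sketch yields two gestures.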
Gesture extraction
record timestamp     time difference   button     state   x     y     gesture
1434623080.316000    -                 NoButton   Move    686   281   #1
1434623080.419000    0.103             NoButton   Move    687   287   #1
1434623080.615000    0.196             NoButton   Move    687   287   #1
1434623080.745000    0.130             NoButton   Move    687   287   #1
1434623081.557000    0.812             NoButton   Move    690   288   #2
1434623081.667000    0.110             NoButton   Move    742   300   #2
1434623081.877000    0.210             NoButton   Move    748   300   #2
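The features: what are appropriate features of a gesture?
            Feature #1   Feature #2   ...   Made by user?
Gesture 1   ...          ...          ...   1
Gesture 2   ...          ...          ...   1
Gesture 3   ...          ...          ...   0
Gesture 4   ...          ...          ...   0
...         ...          ...          ...   ...
Rows: gestures.
Columns: features (observed attributes), then the observed target variable.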
Feature engineering
What properties of gestures can be defined that might be useful in
differentiating between users?
*CLICK*
ts0, x0, y0
ts1, x1, y1
tsn, xn, yn
Duration: tsn - ts0
Path length: sum of distances between consecutive points
Avg. speed: path length / duration
Time to click: time spent between last move and click (if any)
Mean/std/etc. of (consecutive) speed/acceleration/etc. values
Also: angles
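As an illustration, the first three features in the list can be computed as follows. This is a sketch under stated assumptions: the function name `gesture_features` and the representation of a gesture as a list of `(ts, x, y)` tuples are chosen for the example and do not come from the original project.

```python
import math


def gesture_features(points):
    """Compute simple features of one gesture.

    points: list of (ts, x, y) tuples ordered by time, with at least two items.
    """
    # Duration: tsn - ts0
    duration = points[-1][0] - points[0][0]
    # Path length: sum of distances between consecutive points
    path_length = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(points, points[1:])
    )
    # Avg. speed: path length / duration (0.0 if the gesture has no duration)
    avg_speed = path_length / duration if duration > 0 else 0.0
    return {
        "duration": duration,
        "path_length": path_length,
        "avg_speed": avg_speed,
    }


# Hypothetical gesture: the cursor moves 5 pixels in 1 second, then rests for 1 second.
features = gesture_features([(0.0, 0, 0), (1.0, 3, 4), (2.0, 3, 4)])
print(features)
```

Each gesture then becomes one row of the data table, with one column per feature.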
The data set is now ready to be analyzed
Avg. speed (pixel/sec)   Duration (sec)   ...   Made by user?
34.5                     5                ...   1
12.1                     3                ...   1
1.23                     12               ...   0
55.9                     3                ...   0
...                      ...              ...   ...
Rows: gestures.
Columns: features (observed attributes), then the observed target variable.
Part 5
Building predictive models:
Decision trees and forests
Outline of predictive modeling
We have many observations about a certain event, process etc. Each
observation is a pair of several features and a target variable.
With a learning algorithm and our data we aim to build a predictive
model that learns what the typical value of the target is for any
combination of feature values.
We can then use the model for predicting the value of the target of
(new) observations solely based on their features.
Example: wine prices
Rain during harvest (mm)   Mean temperature in May (℃)   ...   Price (€)
18                         5                             ...   2.5
200                        4                             ...   16
180                        10                            ...   250
100                        2                             ...   9.5
...                        ...                           ...   ...
First row: headers. Other rows: examples (data points).
Columns: features (observed attributes), then the observed target variable.
Example: the Titanic data set
sex      fare (£)   ...   survived
male     200        ...   1
female   40         ...   1
female   150        ...   1
male     40         ...   0
...      ...        ...   ...
First row: headers. Other rows: examples (data points).
Columns: features (observed attributes), then the observed target variable.
Prediction problems
The two main types of prediction problems are:
- Classification: the target variable is a categorical variable (e.g.,
yes/no decision, letters to be recognized)
- Regression: the target variable is a continuous variable (e.g., age,
income, stock prices)
For both classification and regression, there are hundreds of learning
algorithms to choose from. Picking one is a problem in itself, and influences
the success of the project.
Example of regression (1D)
Each blue point is an observation. We have to build a model that can tell
the income based on the age of the client.
[Figure: scatter plot of monthly income (y, ticks at 1000 and 2000) against age (x, 15 to 75)]
Example of regression (1D)
The task translates to fitting a curve to the points that we see!
[Figure: scatter plot of monthly income (y, ticks at 1000 and 2000) against age (x, 15 to 75)]
Example of regression (1D)
1st solution: connecting the dots.
[Figure: scatter plot of monthly income (y, ticks at 1000 and 2000) against age (x, 15 to 75)]
Example of regression (1D)
2nd solution: draw a straight line through the points.
[Figure: scatter plot of monthly income (y, ticks at 1000 and 2000) against age (x, 15 to 75)]
Example of regression
[Figure: scatter plot of monthly income (y, ticks at 1000 and 2000) against age (x, 15 to 75)]
Which one is better?
Example of regression
[Figure: scatter plot of monthly income (y, ticks at 1000 and 2000) against age (x, 15 to 75)]
New clients have arrived (red); let’s see how the model performs!
Decision tree
We make predictions about the target (y) by answering
questions about the features (x1, …, xn).
An answer to a question either leads to a next question or directly to a
prediction.
We store the series of decisions in a tree structure. The leaves contain the
predictions. Each node that is not a leaf contains a question.
Example: Titanic decision tree
[Figure: decision tree with the questions “Male?”, “Age >= 10?” and “Family members on ship >= 3?”; each question has a yes and a no branch, and the leaves are “survives” or “dies”]
Building a tree (2D)
Let us build a decision tree to decide whether an article on a news portal will
be popular or not! We have two features: # of photos, # of paragraphs.
[Figure: articles plotted by # paragraphs and # photos; markers: popular / not popular]
Building a tree (2D)
Cut the space based on # of paragraphs.
Tree so far:
# paragraphs > 10?
[Figure: articles plotted by # paragraphs and # photos, with a cut at # paragraphs = 10; markers: popular / not popular]
Building a tree (2D)
Cut the space based on # of paragraphs.
Tree so far:
# paragraphs > 10?
  yes: not popular
[Figure: articles plotted by # paragraphs and # photos, with a cut at # paragraphs = 10; markers: popular / not popular]
Building a tree (2D)
The rest is not homogeneous enough; we proceed with cutting.
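Tree so far:
# paragraphs > 10?
  yes: not popular
  no: (to be cut further)
[Figure: articles plotted by # paragraphs and # photos, with a cut at # paragraphs = 10; markers: popular / not popular]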
Building a tree (2D)
Cut the rest of the space based on # of photos.
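Tree so far:
# paragraphs > 10?
  yes: not popular
  no: # photos > 6?
[Figure: articles plotted by # paragraphs and # photos, with cuts at # paragraphs = 10 and # photos = 6; markers: popular / not popular]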
Building a tree (2D)
Cut the rest of the space based on # of photos.
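Tree so far:
# paragraphs > 10?
  yes: not popular
  no: # photos > 6?
    yes: popular
[Figure: articles plotted by # paragraphs and # photos, with cuts at # paragraphs = 10 and # photos = 6; markers: popular / not popular]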
Building a tree (2D)
The rest is not homogeneous enough; we proceed with cutting.
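Tree so far:
# paragraphs > 10?
  yes: not popular
  no: # photos > 6?
    yes: popular
    no: (to be cut further)
[Figure: articles plotted by # paragraphs and # photos, with marks at # paragraphs = 10, # photos = 6 and # paragraphs = 2; markers: popular / not popular]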
Building a tree (2D)
Cut again based on # of paragraphs.
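Tree so far:
# paragraphs > 10?
  yes: not popular
  no: # photos > 6?
    yes: popular
    no: # paragraphs < 2?
[Figure: articles plotted by # paragraphs and # photos, with cuts at # paragraphs = 10, # photos = 6 and # paragraphs = 2; markers: popular / not popular]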
Building a tree (2D)
Cut again based on # of paragraphs.
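Tree so far:
# paragraphs > 10?
  yes: not popular
  no: # photos > 6?
    yes: popular
    no: # paragraphs < 2?
      yes: not popular
[Figure: articles plotted by # paragraphs and # photos, with cuts at # paragraphs = 10, # photos = 6 and # paragraphs = 2; markers: popular / not popular]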
Building a tree (2D)
The rest is homogeneous enough; we stop cutting the space.
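Final tree:
# paragraphs > 10?
  yes: not popular
  no: # photos > 6?
    yes: popular
    no: # paragraphs < 2?
      yes: not popular
      no: popular
[Figure: articles plotted by # paragraphs and # photos, with cuts at # paragraphs = 10, # photos = 6 and # paragraphs = 2; markers: popular / not popular]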
Using tree for prediction
What will be the popularity of a new article according to the tree?
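# paragraphs > 10?
  yes: not popular
  no: # photos > 6?
    yes: popular
    no: # paragraphs < 2?
      yes: not popular
      no: popular
[Figure: articles plotted by # paragraphs and # photos, with cuts at # paragraphs = 10, # photos = 6 and # paragraphs = 2; markers: popular / not popular]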
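Written as code, a decision tree is just a chain of nested questions. The sketch below hard-codes the three questions of the article-popularity tree built on the previous slides; the function name `predict_popularity` and the sample articles are made up for the illustration.

```python
def predict_popularity(n_paragraphs, n_photos):
    """Walk the article-popularity tree from the root to a leaf."""
    if n_paragraphs > 10:       # root question
        return "not popular"
    if n_photos > 6:            # second question, asked only if the first answer was "no"
        return "popular"
    if n_paragraphs < 2:        # third question
        return "not popular"
    return "popular"


# Hypothetical new articles: (number of paragraphs, number of photos)
for article in [(12, 8), (5, 8), (1, 3), (5, 3)]:
    print(article, predict_popularity(*article))
```

A learning algorithm does not write such a function by hand, of course; it chooses the questions and the cut points from the training examples.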
What if we do not stop cutting the space?
Take this task as an example.
What if we do not stop cutting the space?
We cut the space to fully homogeneous areas.
What if we do not stop cutting the space?
You see that red area in the middle of a large blue one? It is more like the
result of “getting lost in the details” than seeing the true trend.
If a new point, which is in fact blue,
accidentally falls there, it will be
classified as red.
Stopping criteria
A couple of rules for trees to prevent getting lost in the
details, i.e., growing too large, i.e., overfitting:
- They cannot have more levels than x,
- We do not cut areas with fewer than x points,
- We do not cut areas if one of the new areas would
have fewer than x points,
- We do not cut areas that are homogeneous enough
(as measured by entropy, Gini index etc.)
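The four rules can be sketched as a single check that is asked before every cut. The Gini impurity below is the standard definition (1 minus the sum of squared class proportions); the function names, the parameter names and the default limits are assumptions chosen for the illustration, each playing the role of one “x” in the list above.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 0.0 means fully homogeneous."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def may_cut(labels, depth, left_size, right_size,
            max_depth=5, min_points_to_cut=10, min_points_per_area=3,
            max_impurity_to_stop=0.1):
    """Return True if an area may be cut further under the stopping rules."""
    if depth >= max_depth:                                # too many levels
        return False
    if len(labels) < min_points_to_cut:                   # too few points in the area
        return False
    if min(left_size, right_size) < min_points_per_area:  # a new area would be too small
        return False
    if gini_impurity(labels) <= max_impurity_to_stop:     # homogeneous enough
        return False
    return True


mixed_area = [0, 1] * 10  # 20 points, half of each class
print(gini_impurity(mixed_area))
print(may_cut(mixed_area, depth=2, left_size=10, right_size=10))
```

Stricter limits give smaller trees that generalize more but may miss real structure; the right values are usually found by validation.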
Random forests
In a forest there are several independent trees. Each tree grows seeing a
different random part of the whole data set.
When making a prediction, the prediction of the forest is decided by a vote of the trees.
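A minimal sketch of the two ideas behind a forest: every tree sees its own random part of the data, and the forest answers with the majority vote. Drawing the random part by sampling rows with replacement (a bootstrap sample) is one common choice and is an assumption here, as are the function names; the three “trees” in the usage example are stand-in functions, not trained models.

```python
import random
from collections import Counter


def random_part(rows, rng):
    """Draw as many rows as the data set has, with replacement."""
    return [rng.choice(rows) for _ in rows]


def forest_predict(trees, features):
    """Let every tree vote and return the most frequent answer."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]


rng = random.Random(42)
rows = list(range(10))
print(random_part(rows, rng))  # the random part one tree would be trained on

# Stand-in "trees": each one maps a feature dictionary to a class label.
trees = [
    lambda f: 1 if f["avg_speed"] > 10 else 0,
    lambda f: 1 if f["duration"] < 6 else 0,
    lambda f: 0,
]
print(forest_predict(trees, {"avg_speed": 34.5, "duration": 5}))
```

Using an odd number of trees avoids ties in a two-class vote.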
Aggregating the gesture-level predictions
After learning, the forest can predict if a gesture was legal or not.
Avg. speed (pixel/sec)   Duration (sec)   ...   Made by user? (prediction)
34.5                     5                ...   1
12.1                     3                ...   1
1.23                     12               ...   0
55.9                     3                ...   0
...                      ...              ...   ...
Aggregating the gesture-level predictions
We need to make a decision about a whole session for a user!
For this, we aggregate the predictions for the gestures in a whole session.
Avg. speed (pixel/sec)   Duration (sec)   ...   Made by user? (prediction)
34.5                     5                ...   1
12.1                     3                ...   1
1.23                     12               ...   0
55.9                     3                ...   0
...                      ...              ...   ...
Average of the predictions:
- If < 0.5, the session is regarded as illegal
- If > 0.5, the session is regarded as legal
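The session-level decision can be sketched as follows. The function name `classify_session` is hypothetical, and since the rule does not say what happens when the average is exactly 0.5, the sketch reports such a session as “undecided”; that is an assumption of this example.

```python
def classify_session(gesture_predictions, threshold=0.5):
    """Aggregate 0/1 gesture-level predictions into one verdict per session."""
    if not gesture_predictions:
        raise ValueError("a session needs at least one gesture")
    average = sum(gesture_predictions) / len(gesture_predictions)
    if average > threshold:
        return "legal"
    if average < threshold:
        return "illegal"
    return "undecided"  # assumption: the tie case is not defined by the rule


print(classify_session([1, 1, 1, 0]))  # average 0.75
print(classify_session([1, 0, 0, 0]))  # average 0.25
print(classify_session([1, 1, 0, 0]))  # average 0.5
```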
Part 6
Validating predictive models
Overfitting is bad, what to do about it?
We are afraid of the more complex models but we need them!
How should we decide the amount of complexity which is JUST ENOUGH?
● A good model fits on the known examples (obviously) but also fits on unseen
examples
● That is the point: predicting the outcome of unseen examples is similar to
predicting examples from the future
● We can simulate having new examples by slicing the known dataset into two
parts:
○ Training dataset: examples only for training
○ Test dataset: examples only for measuring performance
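Slicing the known examples can be as simple as shuffling them and cutting the list in two. This is a minimal sketch; the function name, the 25% test share and the fixed seed are assumptions of the example.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle the known examples and slice them into a training and a test part."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


examples = list(range(100))  # stand-ins for 100 known examples
train, test = train_test_split(examples)
print(len(train), len(test))
```

The model is fitted on `train` only; `test` is touched only when measuring performance.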
Validation
[Figure: the table of every known example (Feature #1, Feature #2, ..., Target) is split by rows into a training set and a test set]
Validation
How is it done?
One can increase the complexity of the learning model as long as the
goodness of fit on the UNSEEN data increases.
● The goodness of fit on the training set will increase until full overfitting
● The goodness of fit on the test set will increase but just to a certain
point
We can visualize it with the learning curve.
The learning curve
“The goodness of fit on the training set will increase until full overfitting.”
That is, the error will decrease on the training set until full overfitting.
[Figure: error of model (y) against the amount of complexity we allow (x); curve: training dataset]
The learning curve
“The goodness of fit on the test set will increase but just to a certain point.”
That is, the error on the test set will decrease but just to a certain point.
[Figure: error of model (y) against the amount of complexity we allow (x); curves: training dataset and unseen test dataset]
The learning curve
After the optimal point, every “bit of knowledge” the model gains is not
general, but just data-specific knowledge about the particular training
dataset it sees.
[Figure: error of model (y) against the amount of complexity we allow (x); curves: training dataset and unseen test dataset; the optimal complexity is marked]
Validation
We sacrifice some data (and potentially
information) but we gain objective,
measurable knowledge about how well
our model will perform “out there”.
There are some techniques (e.g., cross validation) which try to eliminate this loss of
information by selecting different parts of the known data as training sets, and then
aggregating the results of these different scenarios.
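One such technique is k-fold cross validation: the known data is cut into k parts, and each part serves once as the test set while the remaining parts form the training set. The sketch below only produces the index splits; the function name `k_fold_splits` is an assumption, and in practice the data would be shuffled first.

```python
def k_fold_splits(n_examples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_examples))
    # The first (n_examples % k) folds get one extra example.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_indices = indices[start:start + size]
        train_indices = indices[:start] + indices[start + size:]
        yield train_indices, test_indices
        start += size


splits = list(k_fold_splits(10, 3))
for train_indices, test_indices in splits:
    print(test_indices)
```

A model is trained and measured once per split, and the k results are then aggregated, for example by averaging.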
Part 7
Measuring performance of
predictive models
Measuring performance
What do we mean exactly by “goodness of fit”?
We would like to have minor differences between the predictions and the
real value of the target attributes from the test data set.
If our problem is regression (the truth is a continuous variable):
● Add up the sizes of the differences between the prediction and the truth for all
examples;
● The smaller the sum, the better our model.
● Exact match is rare, but a close guess is usable.
● E.g.: RMSE, root mean squared error
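RMSE squares each difference before averaging, so positive and negative errors cannot cancel out and large errors count more. A short sketch with made-up numbers (the function name and the sample values are only for illustration):

```python
import math


def rmse(predictions, truths):
    """Root mean squared error between predicted and true target values."""
    if len(predictions) != len(truths) or not truths:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, truths)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predictions and true values for three test examples.
print(rmse([10.0, 12.0, 20.0], [9.0, 15.0, 20.0]))
```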
Measuring performance
If our problem is classification:
● If the predicted class misses the true class, there is no magnitude of
error. Not correct is not correct. (There is no “slightly pregnant
woman”.)
● Counting the rate of the correct predictions seems like a good idea,
but it is not a great one.
What can a classifier model do?
Not so many things, considering two classes, namely: “positive” and “negative”:
Predicts “Positive” when the reality is “Positive”
Predicts “Positive” when the reality is “Negative”
Predicts “Negative” when the reality is “Negative”
Predicts “Negative” when the reality is “Positive”
Let’s make a small table
If we rearrange the smiley faces:
               Reality +        Reality -
Prediction +   True Positive    False Positive
Prediction -   False Negative   True Negative
The confusion matrix catches ‘em all.
(The most important 2-by-2 matrix in machine learning.)
Let’s make a small table
With a perfect model:
               Reality +   Reality -
Prediction +   5           0
Prediction -   0           5
● 5 positive and 5 negative cases in the
dataset to be predicted.
● Every prediction is correct.
Let’s make a small table
A more realistic scenario:
               Reality +   Reality -
Prediction +   4           1
Prediction -   1           4
● 5 positive and 5 negative cases in the
dataset to be predicted
● There is one misclassified case for
each class.
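Accuracy = the rate of the correctly classified cases.
               Reality +   Reality -
Prediction +   985         5
Prediction -   5           5
With the confusion matrix:
sum(blue cells) / sum(all cells) = 990/1000 = 99%
(The blue cells are the correctly classified cases: the true positives and the true negatives.)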
Is this a good model?
Note that in a case like this, the model is likely to be
used to spot the NEGATIVE events. (Those are the
rare, interesting cases.)
This particular model has an awful performance on
those cases. Half of them are mis-classified!
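The numbers of this example can be checked in a few lines. The function names are chosen for the illustration; the counts are those of the confusion matrix above (985 true positives, 5 false positives, 5 false negatives, 5 true negatives).

```python
def accuracy(tp, fp, fn, tn):
    """Rate of correctly classified cases among all cases."""
    return (tp + tn) / (tp + fp + fn + tn)


def rate_of_negatives_found(tn, fp):
    """Share of the truly negative cases that the model predicted as negative."""
    return tn / (tn + fp)


tp, fp, fn, tn = 985, 5, 5, 5
print(accuracy(tp, fp, fn, tn))         # 0.99
print(rate_of_negatives_found(tn, fp))  # 0.5
```

The same model scores 99% on accuracy and only 50% on the rare class it is meant to spot.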
How was that comment on measuring
performance?
Some performance measures...
There are several other methods which use the values of the confusion matrix in
order to evaluate a classification model.
The method needs to be chosen carefully for the purpose of the application.
Many classifiers don’t give strict verdicts
Though the target variable might be a discrete variable (orange/green), in practice
classifier models return class probabilities (e.g., X% chance of
being green).
[Figure: test examples placed along an axis from 0% to 100%]
This means that one can decide which probability is high enough to predict a
particular label. If a song seems to be 95% “cheerful”, it is a safer bet than
one that is 52% cheerful.
Legend:
Color: the true class of a particular event known from the test dataset
Position: the probability of being in the green class as estimated by the model.
Many classifiers don’t give strict verdicts
The good news: We have a much more detailed view on how the model works,
and on the amount of confidence it has about each prediction it makes.
The bad news: In order to retrieve discrete predictions, the user must decide how
to transform the probabilities into classes, i.e., define a probability threshold which
separates the classes.
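Turning probabilities into classes is a one-line rule once the threshold is fixed. In this sketch the function name, the default threshold of 0.5 and the decision to count a probability equal to the threshold as green are assumptions.

```python
def to_classes(green_probabilities, threshold=0.5):
    """Turn estimated probabilities of being green into discrete labels."""
    return ["green" if p >= threshold else "orange" for p in green_probabilities]


probabilities = [0.95, 0.52, 0.10]  # hypothetical model outputs
print(to_classes(probabilities))
print(to_classes(probabilities, threshold=0.6))
```

Moving the threshold from 0.5 to 0.6 flips the borderline 52% case, while the confident 95% and 10% cases keep their labels.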
Many classifiers don’t give strict verdicts
This is an amazing model! We can find a point in the middle, which separates the
points into two groups, which are 100% the same as the original two categories.
Remember… we don’t have perfect models :(
What should we do when our model outputs something more realistic, like this:
[Figure: test examples placed along an axis from 0% to 100%; color: the true class known from the test dataset, position: the probability of being in the green class as estimated by the model]
On the two sides, the picture is clear. But there are some borderline cases, where
there is some confusion.
2 questions arise:
● How should we find a good threshold?
● How to evaluate a model, without a pre-defined threshold?
Finding a threshold depends on the application, and the problem domain itself, and
has little to do with machine learning.
A threshold with low false positive rate is needed before applying a risky treatment.
A threshold with low false negative rate is needed before blood transfusion.
How should we find a good threshold?
- Towards the left-hand side, we classify every green correctly but misclassify a lot of
oranges as greens. This means a lot of false positives.
- Towards the right-hand side, we classify every orange correctly, but misclassify a lot of
greens as oranges. This means a lot of false negatives.
[Figure: two copies of the 0% to 100% probability axis with the candidate thresholds A and B marked]
How to evaluate without a pre-defined threshold?
Every application has to deal with the
false positive - false negative trade-off,
and each deals with it differently.
Regardless of the application, we have to
be able to tell objectively if a model is
better than another.
Why not compute the false positives
and false negatives for EVERY threshold,
and look at a particular model by
considering these different scenarios?
ROC curve (Receiver Operating Characteristic)
Every point on the red curve
shows the rate of
false positives and true positives for
a particular threshold.
The dotted line is a random model.
The further the red line is above the
dotted line, the better the model.
[Figure: ROC curve; x axis: how many false positives at a threshold? (0% to 100%); y axis: how many true positives at a threshold? (0% to 100%)]
ROC curve (Receiver Operating Characteristic)
AUC = “Area Under Curve”
The bigger the area under the red line,
the better the model.
The area under the dotted line: 0.5
The perfect model: 1.0
“You can decrease the false-positive rate
to 0, and in that process, you don’t
generate any false negatives.”
[Figure: ROC curve; x axis: how many false positives at a threshold? (0% to 100%); y axis: how many true positives at a threshold? (0% to 100%)]
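Both the curve and the area can be computed directly from the test set. The sketch below tries every distinct predicted probability as a threshold, records the false positive and true positive rates, and integrates with the trapezoid rule; the function names and the toy scores are assumptions of the example, and both classes must be present in the labels.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    scores: estimated probability of the positive class for each example
    labels: true class of each example, 1 for positive and 0 for negative
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]  # threshold above every score: nothing is predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoid rule over the ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


scores = [0.9, 0.8, 0.2, 0.1]  # toy model outputs
perfect = area_under_curve(roc_points(scores, [1, 1, 0, 0]))
reversed_model = area_under_curve(roc_points(scores, [0, 0, 1, 1]))
print(perfect, reversed_model)
```

In the toy data the first labeling is separated perfectly by the scores, and the second one is ranked exactly the wrong way round.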
Wrap up
- Data science as a field is big and diverse; machine learning is a key
tool to master
- Given enough examples, machines can learn
- Learning is more complex than memorizing
- A great effort is needed to prepare the examples (features and target)
- The bigger challenge is not fitting a model but avoiding overfitting
- Several key decisions have to be made after a model has been
constructed

SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...ijdpsjournal
 
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...ijdpsjournal
 
Guide for a Data Scientist
Guide for a Data ScientistGuide for a Data Scientist
Guide for a Data ScientistRohit Dubey
 
Machine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its ApplicationsMachine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its ApplicationsArpana Awasthi
 
Credit card fraud detection using python machine learning
Credit card fraud detection using python machine learningCredit card fraud detection using python machine learning
Credit card fraud detection using python machine learningSandeep Garg
 
what-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdfwhat-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdfTemok IT Services
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationAnkit Gupta
 
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...eswaralaldevadoss
 
Introduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdfIntroduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdfmallikarjuntalakal
 
Introduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdfIntroduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdfikenossama03
 
Regression and correlation
Regression and correlationRegression and correlation
Regression and correlationVrushaliSolanke
 
Mixed Methods Research in the Age of Big Data: A Primer for UX Researchers
Mixed Methods Research in the Age of Big Data: A Primer for UX ResearchersMixed Methods Research in the Age of Big Data: A Primer for UX Researchers
Mixed Methods Research in the Age of Big Data: A Primer for UX ResearchersUXPA International
 
UXPA 2016: Mixed Methods Research in the Age of Big Data
UXPA 2016: Mixed Methods Research in the Age of Big DataUXPA 2016: Mixed Methods Research in the Age of Big Data
UXPA 2016: Mixed Methods Research in the Age of Big DataZachary Sam Zaiss
 
Machine Learning with Azure and Databricks Virtual Workshop
Machine Learning with Azure and Databricks Virtual WorkshopMachine Learning with Azure and Databricks Virtual Workshop
Machine Learning with Azure and Databricks Virtual WorkshopCCG
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big dataPoo Kuan Hoong
 

Semelhante a Machine learning at b.e.s.t. summer university (20)

Machine Learning
Machine Learning Machine Learning
Machine Learning
 
MB2208A- Business Analytics- unit-4.pptx
MB2208A- Business Analytics- unit-4.pptxMB2208A- Business Analytics- unit-4.pptx
MB2208A- Business Analytics- unit-4.pptx
 
Machine learning
Machine learningMachine learning
Machine learning
 
Unit I and II Machine Learning MCA CREC.pptx
Unit I and II Machine Learning MCA CREC.pptxUnit I and II Machine Learning MCA CREC.pptx
Unit I and II Machine Learning MCA CREC.pptx
 
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
 
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
SEAMLESS AUTOMATION AND INTEGRATION OF MACHINE LEARNING CAPABILITIES FOR BIG ...
 
Guide for a Data Scientist
Guide for a Data ScientistGuide for a Data Scientist
Guide for a Data Scientist
 
Machine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its ApplicationsMachine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its Applications
 
Data Science for Finance Interview.
Data Science for Finance Interview. Data Science for Finance Interview.
Data Science for Finance Interview.
 
Credit card fraud detection using python machine learning
Credit card fraud detection using python machine learningCredit card fraud detection using python machine learning
Credit card fraud detection using python machine learning
 
what-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdfwhat-is-machine-learning-and-its-importance-in-todays-world.pdf
what-is-machine-learning-and-its-importance-in-todays-world.pdf
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning Presentation
 
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...
 
Introduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdfIntroduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdf
 
Introduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdfIntroduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdf
 
Regression and correlation
Regression and correlationRegression and correlation
Regression and correlation
 
Mixed Methods Research in the Age of Big Data: A Primer for UX Researchers
Mixed Methods Research in the Age of Big Data: A Primer for UX ResearchersMixed Methods Research in the Age of Big Data: A Primer for UX Researchers
Mixed Methods Research in the Age of Big Data: A Primer for UX Researchers
 
UXPA 2016: Mixed Methods Research in the Age of Big Data
UXPA 2016: Mixed Methods Research in the Age of Big DataUXPA 2016: Mixed Methods Research in the Age of Big Data
UXPA 2016: Mixed Methods Research in the Age of Big Data
 
Machine Learning with Azure and Databricks Virtual Workshop
Machine Learning with Azure and Databricks Virtual WorkshopMachine Learning with Azure and Databricks Virtual Workshop
Machine Learning with Azure and Databricks Virtual Workshop
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big data
 

Último

Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...KarteekMane1
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Milind Agarwal
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfSubhamKumar3239
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 

Último (20)

Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 

Machine learning at b.e.s.t. summer university

  • 1. Machine learning - A summer approach July 21 2016
  • 3. What to expect To have an introduction to machine learning by walking through the steps of a real project as an example. 1. What does a data scientist do 2. Basic ideas in machine learning 3. Demo: a machine learning application 4. Data preparation for predictive modeling 5. Building predictive models: Decision trees and forests 6. Validating predictive models 7. Measuring performance of predictive models
  • 4. Part 1 What does a data scientist do
  • 7. Prediction The ability to make reliable predictions about future events by using the patterns seen in historical data. Examples: - Which one of my customers will end their contract based on their mobile phone usage data? - Given the friendship graph of my users, what new connections are likely to be made?
  • 8. Anomaly detection Uncovering unusual events and potential fraud by noticing deviations of the data from what is normal. Examples: - It could be suspicious if a customer suddenly consumes much less power than is usual for them according to the data from the meters. - By knowing which user typically issues which commands, I am able to recognize weird and outlying operations on a computer.
  • 9. Gaining insights Extracting hidden connections and knowledge about our customers, products, and business processes. Examples: - Based on the data about their visits, we can discover typical segments of users and observe in which aspects they use our web site similarly - We crawl Twitter for thousands of pieces of user feedback and learn the general sentiment and emotions towards our company
  • 10. Making valid decisions The ability to validate business-related hypotheses or compare alternatives in a mathematical sense. Examples: - Will the subscription rate drop if we change the text used in my email marketing campaign? - How do I redesign my web page to maximize the time spent by the visitors? 1) Define an experiment. 2) Measure the results on a sample. 3) Infer the properties of the whole population.
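The three steps on this slide (define an experiment, measure on a sample, infer about the population) can be illustrated with a small sketch. It uses a two-proportion z-test, which is one common choice and not necessarily what the presenters had in mind; the function name and the subscription counts are made up for illustration.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the hypothesis that both rates are equal."""
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    # Pooled rate under the assumption that the two variants do not differ
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: old email text (A) vs. new email text (B),
# each sent to a random sample of 1000 recipients.
z, p_value = two_proportion_z_test(success_a=120, n_a=1000, success_b=150, n_b=1000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A small p-value suggests that the difference measured on the sample is unlikely to be due to chance alone, which is what lets us say something about the whole population.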
  • 11. What do we do? We are building a data-driven IT security product. The software aims to find anomalies in IT security related system logs. The behavior of the users of the IT system is analyzed, and if unusual behavior is detected, alerts are raised. This helps the work of a company's IT security experts by drawing their attention to the most important events in the system.
  • 12. What is Big Data? Whatever this is
  • 13. What is Big Data? From a technical point of view: “a term for data sets that are so large or complex that traditional data processing applications are inadequate” (Wikipedia) -> Infrastructure-wise Big Data From a layman’s point of view: “extremely large data sets that may be analysed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions” (Oxford Dictionary) -> Impact-wise Big Data
  • 14. What is Big Data? The 3 Vs (Gartner): - Volume: the data to be processed takes GBs, TBs of space - Velocity: new data comes frequently at a high speed - Variety: the data has a variety of formats and cannot be stored in tabular (relational) form
  • 15. Data scientist vs Big Data Data scientists Professionals who deal with big data
  • 16. Part 2 Basic ideas in machine learning
  • 17. Types of learning Understanding (meaningful learning): What you learn converts to understanding of the concept. Your knowledge is general: you can apply it to new situations. Memorizing (rote learning): What you learn can be quickly recalled, but is superficial and cannot be applied in another context.
  • 18. Computers and memorization Computers are excellent at accurately storing and recalling huge amounts of data: documents, dictionaries, bits of video files, etc. But this is solely memorizing. Do they understand what’s going on? If a student memorizes a question bank before the exam without understanding a word, she might pass an exam containing the same questions, but will fail when answering new questions.
  • 19. Machine learning Machine learning is about enabling computers to learn through examples. The goal is, after having seen many examples, to find patterns that generalize so well that they can be used in future situations. Can a student actually gain understanding by seeing questions and their answers?
  • 20. Meaningful machine learning As data scientists, while using machine learning as a tool, our most important task is to prevent memorizing (or overfitting in this context) because we want to use the acquired knowledge for new examples in the future. Although the machine will never “understand” the data, we can motivate our algorithms to find trends, correlation structures, and connections.
  • 21. Nature of machine learning A machine processes (learns from) more data than a human; it can deal with amounts of data that we cannot. With machines, learning can be automated; machines deal with repetitive tasks more easily than humans. The patterns found by the machines will never be perfect, but given enough examples of appropriate quality and quantity, they will be useful.
  • 22. When to use machine learning If the following conditions hold: 1) There is a pattern to be learned; a pattern between the questions (inputs) and answers (output) 2) We cannot formulate the pattern mathematically 3) We have enough data (examples) for learning (Abu-Mostafa: Learning from Data)
  • 23. When not to use machine learning To find out, for instance: - The winning numbers of next week’s lottery (no pattern) - The area of a triangle (can be formulated) - The time of the next financial crisis (not enough data)
  • 24. Learning game (Abu-Mostafa: Learning from Data) label = “0” label = “1” label = ?
  • 25. Learning game (Abu-Mostafa: Learning from Data) Takeaways: - There is no single solution; there are many possible ones - The more examples we see during learning, the more confident we can be about our solution
  • 26. Key aspects to consider in a machine learning task Data: What are the examples, how do we get them? Unit of observation: What is considered one example? Observed features: What attributes do we store about an example? Observed target variable: What is the attribute we want to be able to predict? Outcome: What is the meaning of the predicted target variable? Business Case: How can we use the predictions?
  • 27. Predict if an employee wants to quit Data: Personal, work-related data from HR database Unit of observation: One employee Observed features: Overtime, effectiveness, patterns in days-off and sick-days, commuting time, etc. Observed target variable: Who quit in the past? Outcome: What are the chances of someone quitting? Business Case: Prevent quitting by focused countermeasures, e.g. mentoring.
  • 28. Predicting flight delays Data: Air traffic data from airport systems Unit of observation: A single flight from A to B Observed features: Origin, destination, airline, day of year, weather Observed target variable: Delay in minutes Outcome: Prediction of punctuality Business Case: What is the expected loss from delays?
  • 29. Biometric authentication with mouse dynamics Data: Server logs about user sessions Unit of observation: A single movement of a mouse cursor from A to B Observed features: Length, straightness, speed Observed target variable: The username of the user Outcome: Anomaly level of a user session Business Case: Improved security with automatic alerts
  • 30. Classify mood of music Data: 500 mp3 files Unit of observation: Song in mp3 Observed features: ? Observed target variable: Manually defined labels, either “cheerful” or “blue” Outcome: ? Business Case: ?
  • 31. How to represent in data table format [Table: the header row names the columns (Feature #1, Feature #2, ..., Target variable); each row is an example (data point); the columns are the features (observed attributes) followed by the observed target variable]
  • 32. Part 3 Demo: a machine learning application
  • 33. Part 4 Data preparation for predictive modeling
  • 34. Data preprocessing From raw data :( to data ready to be analyzed :) through a REPRODUCIBLE PROCESS. Topics involved: data representations, joining data tables, character encoding, aggregations, pivoting, parsing raw data, date formats.
  • 35. The data source: Remote Desktop connection logs
  • 37. Raw data of a session
        record timestamp   client timestamp  button    state     x    y
        1434623080.316000  4053743.247000    NoButton  Move      686  281
        1434623080.419000  4053743.357000    NoButton  Move      687  287
        1434623080.615000  4053743.559000    Left      Pressed   687  287
        1434623080.745000  4053743.684000    Left      Released  687  287
        1434623081.557000  4053744.495000    NoButton  Move      690  288
        1434623081.667000  4053744.605000    NoButton  Move      742  300
        … (some 10k lines)
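As a rough illustration of the "parsing raw data" step, the sketch below turns rows like the ones on this slide into typed records. It assumes the fields are separated by whitespace, which the slide does not confirm; the `MouseEvent` type and `parse_line` helper are hypothetical names introduced here.

```python
from typing import NamedTuple

class MouseEvent(NamedTuple):
    record_ts: float   # record timestamp (seconds)
    client_ts: float   # client timestamp
    button: str        # e.g. "NoButton", "Left"
    state: str         # e.g. "Move", "Pressed", "Released"
    x: int
    y: int

def parse_line(line):
    """Parse one whitespace-separated row (assumed format) into a MouseEvent."""
    record_ts, client_ts, button, state, x, y = line.split()
    return MouseEvent(float(record_ts), float(client_ts), button, state, int(x), int(y))

# The six sample rows shown on the slide
RAW = """\
1434623080.316000 4053743.247000 NoButton Move 686 281
1434623080.419000 4053743.357000 NoButton Move 687 287
1434623080.615000 4053743.559000 Left Pressed 687 287
1434623080.745000 4053743.684000 Left Released 687 287
1434623081.557000 4053744.495000 NoButton Move 690 288
1434623081.667000 4053744.605000 NoButton Move 742 300
"""

events = [parse_line(line) for line in RAW.splitlines() if line.strip()]
print(events[2])
```

Keeping the parsing in a function like this is one way to make the preprocessing reproducible: the same code can be rerun on every session file.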
  • 39. The target variable: what is the goal of the analysis? [Table: rows are examples (data points); columns are Feature #1, Feature #2, ... (observed attributes) and the observed target variable]
  • 40. The examples: what would be appropriate as an example? [Table: rows are examples (data points); columns are Feature #1, Feature #2, ... (observed attributes) and the observed target variable "Made by user?" with values 1, 1, 0, 0, ...]
  • 41. Gesture A gesture: moving the cursor from one point to another in one go. - Large enough to capture the mouse moving characteristics of a user, - Small enough to have a lot of them to learn from. A possible definition of a gesture: We process the raw file from the beginning row-by-row. At each step, if the time difference is larger than 0.3 sec, or a mouse button is pressed, the current gesture ends and a new one starts.
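The gesture definition above can be sketched in a few lines. This is a minimal interpretation, not the presenters' actual code: events are assumed to be `(timestamp, button, state, x, y)` tuples sorted by time, and a button press is assumed to be kept as the last event of the gesture it ends. The function name is hypothetical.

```python
MAX_GAP = 0.3  # seconds; the threshold named on the slide

def split_into_gestures(events, max_gap=MAX_GAP):
    """Split time-ordered (timestamp, button, state, x, y) events into gestures."""
    gestures = []
    current = []
    prev_ts = None
    for event in events:
        ts, button, state, x, y = event
        # A pause longer than max_gap closes the current gesture
        if current and ts - prev_ts > max_gap:
            gestures.append(current)
            current = []
        current.append(event)
        # A button press also closes the gesture (assumption: the press
        # itself is stored as the gesture's final event)
        if state == "Pressed":
            gestures.append(current)
            current = []
        prev_ts = ts
    if current:
        gestures.append(current)
    return gestures

# Rows from the "Gesture extraction" slide
events = [
    (1434623080.316, "NoButton", "Move", 686, 281),
    (1434623080.419, "NoButton", "Move", 687, 287),
    (1434623080.615, "NoButton", "Move", 687, 287),
    (1434623080.745, "NoButton", "Move", 687, 287),
    (1434623081.557, "NoButton", "Move", 690, 288),
    (1434623081.667, "NoButton", "Move", 742, 300),
    (1434623081.877, "NoButton", "Move", 748, 300),
]
gestures = split_into_gestures(events)
print([len(g) for g in gestures])
```

On these rows the only time difference above 0.3 sec is the 0.812 sec pause, so the events fall into two gestures.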
  • 42. Gesture extraction
        record timestamp   time difference  button    state  x    y
        1434623080.316000  -                NoButton  Move   686  281    Gesture #1
        1434623080.419000  0.103            NoButton  Move   687  287    Gesture #1
        1434623080.615000  0.196            NoButton  Move   687  287    Gesture #1
        1434623080.745000  0.130            NoButton  Move   687  287    Gesture #1
        1434623081.557000  0.812            NoButton  Move   690  288    Gesture #2
        1434623081.667000  0.110            NoButton  Move   742  300    Gesture #2
        1434623081.877000  0.210            NoButton  Move   748  300    Gesture #2
  • 43. The features: what are appropriate features of a gesture? [Table: rows are gestures; columns are Feature #1, Feature #2, ... (observed attributes) and the observed target variable "Made by user?" with values 1, 1, 0, 0, ...]
  • 44. Feature engineering What properties of gestures can be defined that might be useful in differentiating between users? [Figure: a cursor path through the points (ts0, x0, y0), (ts1, x1, y1), ..., (tsn, xn, yn), ending in a *CLICK*]
  • 45. Feature engineering What properties of gestures can be defined that might be useful in differentiating between users? [Figure: a cursor path through the points (ts0, x0, y0), (ts1, x1, y1), ..., (tsn, xn, yn), ending in a *CLICK*] Duration: tsn - ts0 Path length: sum of distances between consecutive points Avg. speed: path length / duration Time to click: time spent between last move and click (if any) Mean/std/etc. of (consecutive) speed/acceleration/etc. values Also: angles
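The first three features listed on this slide follow directly from their definitions. The sketch below assumes a gesture is given as a time-ordered list of `(ts, x, y)` points; the function name and the sample points are made up for illustration.

```python
import math

def gesture_features(points):
    """Compute duration, path length and average speed of one gesture.

    points: time-ordered list of (ts, x, y) tuples with at least two entries.
    """
    # Duration: tsn - ts0
    duration = points[-1][0] - points[0][0]
    # Path length: sum of Euclidean distances between consecutive points
    path_length = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(points, points[1:])
    )
    # Avg. speed: path length / duration (guard against zero duration)
    avg_speed = path_length / duration if duration > 0 else 0.0
    return {"duration": duration, "path_length": path_length, "avg_speed": avg_speed}

# Hypothetical gesture: the cursor moves 5 pixels, then rests
features = gesture_features([(0.0, 0, 0), (0.5, 3, 4), (1.0, 3, 4)])
print(features)
```

Each gesture then becomes one row of the data table, with these values as its feature columns.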
  • 46. The data set is now ready to be analyzed
        Avg speed (pixel/sec)  Duration (sec)  ...  Made by user?
        34.5                   5                    1
        12.1                   3                    1
        1.23                   12                   0
        55.9                   3                    0
        ...                    ...             ...  ...
        Rows: gestures. Columns: features (observed attributes) and the observed target variable.
  • 47. Part 5 Building predictive models: Decision trees and forests
  • 48. Outline of predictive modeling We have many observations about a certain event, process, etc. Each observation is a pair: a set of features and a target variable. With a learning algorithm and our data, we aim to build a predictive model that learns the typical value of the target for any combination of feature values. We can then use the model to predict the target of (new) observations based solely on their features.
  • 49. Example: wine prices Rain during harvest (mm) mean temperature in May (℃) ... Price (€) 18 5 2.5 200 4 16 180 10 250 100 2 9.5 ... ... ... ... ... Examples, data point Headers Features (observed attributes) Observed target variable
  • 50. Example: the Titanic data set
      sex     fare (£)  ...  survived
      male    200       ...  1
      female  40        ...  1
      female  150       ...  1
      male    40        ...  0
      ...     ...       ...  ...
    Rows: examples (data points). Header row: column names. Columns: features (observed attributes), then the observed target variable ("survived").
  • 51. Prediction problems The two main types of prediction problems are: - Classification: the target variable is a categorical variable (e.g., yes/no decision, letters to be recognized) - Regression: the target variable is a continuous variable (e.g., age, income, stock prices) For both classification and regression, there are hundreds of learning algorithms to choose from. Picking one is a problem in itself, and the choice influences the success of the project.
  • 52. Example of regression (1D) Each blue point is an observation. We have to build a model that can tell the income based on the age of the client. [Figure: scatter plot, x-axis: Age (15 to 75), y-axis: Monthly income (ticks at 1000 and 2000)]
  • 53. Example of regression (1D) The task translates to fitting a curve to the points that we see! [Figure: the same scatter plot of monthly income against age]
  • 54. Example of regression (1D) 1st solution: connecting the dots. [Figure: the same scatter plot, with the points connected]
  • 55. Example of regression (1D) 2nd solution: draw a straight line through the points. [Figure: the same scatter plot, with a straight line fitted]
  • 56. Example of regression Which one is better? [Figure: the two solutions on the same plot of monthly income against age]
  • 57. Example of regression New clients have arrived (red); let’s see how the model performs! [Figure: the same plot with the new clients added in red]
  • 58. Decision tree We make predictions about the target (y) by answering questions about the features (x1 , …, xn ). An answer to a question either leads to a next question or directly to a prediction. We store the series of decisions in a tree structure. The leaves contain the predictions. Each node that is not a leaf contains a question.
  • 59. Example: Titanic decision tree [Figure: a tree with the questions "Male?", "Age >= 10?" and "Family members on ship >= 3?"; each question has a yes and a no branch, and the leaves read "survives" or "dies"]
  • 60. Building a tree (2D) Let us build a decision tree to decide whether an article on a news portal will be popular or not! We have two features: # of photos, # of paragraphs. [Figure: articles plotted by # paragraphs and # photos, each marked as popular or not popular]
  • 61. Building a tree (2D) Cut the space based on # of paragraphs. Tree so far: # paragraphs > 10? [Figure: the plot with a cut at # paragraphs = 10]
  • 62. Building a tree (2D) Cut the space based on # of paragraphs. Tree so far: # paragraphs > 10? (yes: not popular)
  • 63. Building a tree (2D) The rest is not homogeneous enough; we proceed with cutting. Tree so far: # paragraphs > 10? (yes: not popular; no: to be cut further)
  • 64. Building a tree (2D) Cut the rest of the space based on # of photos. Tree so far: # paragraphs > 10? (yes: not popular; no: # photos > 6?) [Figure: the plot with cuts at # paragraphs = 10 and # photos = 6]
  • 65. Building a tree (2D) Cut the rest of the space based on # of photos. Tree so far: # paragraphs > 10? (yes: not popular; no: # photos > 6? (yes: popular))
  • 66. Building a tree (2D) The rest is not homogeneous enough; we proceed with cutting. Tree so far: # paragraphs > 10? (yes: not popular; no: # photos > 6? (yes: popular; no: to be cut further)) [Figure: the plot with cuts at # paragraphs = 10, # photos = 6 and # paragraphs = 2]
  • 67. Building a tree (2D) Cut again based on # of paragraphs. Tree so far: # paragraphs > 10? (yes: not popular; no: # photos > 6? (yes: popular; no: # paragraphs < 2?))
  • 68. Building a tree (2D) Cut again based on # of paragraphs. Tree so far: # paragraphs > 10? (yes: not popular; no: # photos > 6? (yes: popular; no: # paragraphs < 2? (yes: not popular)))
  • 69. Building a tree (2D) The rest is homogeneous enough; we stop cutting the space. Final tree: # paragraphs > 10? (yes: not popular; no: # photos > 6? (yes: popular; no: # paragraphs < 2? (yes: not popular; no: popular)))
  • 70. Using tree for prediction What will be the popularity of a new article according to the tree? Tree: # paragraphs > 10? (yes: not popular; no: # photos > 6? (yes: popular; no: # paragraphs < 2? (yes: not popular; no: popular))) [Figure: the plot with cuts at # paragraphs = 10, # photos = 6 and # paragraphs = 2, and a new article to classify]
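To make the prediction step concrete, here is a minimal sketch of the article-popularity tree from slides 60 to 70 as a Python data structure. The representation (tuples for questions, strings for leaves) and the names `TREE` and `predict` are illustrative choices, not something the slides prescribe.

```python
# Inner nodes are (question, yes_branch, no_branch); leaves are plain strings.
TREE = (
    lambda article: article["paragraphs"] > 10,
    "not popular",
    (
        lambda article: article["photos"] > 6,
        "popular",
        (
            lambda article: article["paragraphs"] < 2,
            "not popular",
            "popular",
        ),
    ),
)


def predict(node, article):
    """Answer questions from the root until a leaf (a prediction) is reached."""
    while not isinstance(node, str):
        question, yes_branch, no_branch = node
        node = yes_branch if question(article) else no_branch
    return node


new_article = {"paragraphs": 5, "photos": 3}
print(predict(TREE, new_article))  # popular
```

The new article has at most 10 paragraphs, at most 6 photos, and at least 2 paragraphs, so it answers "no" three times and ends in the "popular" leaf.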
  • 71. What if we do not stop cutting the space? Take this task as an example.
  • 72. What if we do not stop cutting the space? We cut the space to fully homogeneous areas.
  • 73. What if we do not stop cutting the space? You see that red area in the middle of a large blue one?
  • 74. What if we do not stop cutting the space? You see that red area in the middle of a large blue one? It is more like the result of “getting lost in the details” than seeing the true trend. If a new point that is in fact blue accidentally falls there, it will be classified as red.
  • 75. Stopping criteria A couple of rules for trees to prevent getting lost in the details, i.e., growing too large, i.e., overfitting: - They cannot have more levels than x, - We do not cut areas with fewer than x points, - We do not cut areas if one of the new areas would have fewer than x points, - We do not cut areas that are homogeneous enough (as measured by entropy, Gini index etc.)
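The four rules can be sketched as a single check, together with the two homogeneity measures mentioned. All threshold values below (`max_depth`, `min_points`, `min_child_points`, `max_gini`) are placeholder assumptions standing in for the "x" on the slide, and `may_split` is a hypothetical helper name.

```python
import math
from collections import Counter


def gini(labels):
    """Gini index: 0 for a fully homogeneous area, 0.5 for a 50/50 two-class mix."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Entropy in bits: 0 for a fully homogeneous area, 1 for a 50/50 two-class mix."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n) for count in Counter(labels).values())


def may_split(labels, depth, left_size, right_size,
              max_depth=5, min_points=10, min_child_points=5, max_gini=0.05):
    """Return True only if none of the four stopping rules forbids the cut."""
    if depth >= max_depth:                              # too many levels
        return False
    if len(labels) < min_points:                        # area has too few points
        return False
    if min(left_size, right_size) < min_child_points:   # a new area would be too small
        return False
    if gini(labels) <= max_gini:                        # already homogeneous enough
        return False
    return True


mixed = ["popular"] * 10 + ["not popular"] * 10
print(gini(mixed), entropy(mixed))  # 0.5 1.0
print(may_split(mixed, depth=1, left_size=10, right_size=10))  # True
```

Tightening any of the thresholds produces smaller trees; which values work best is typically found by validation on unseen data.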
  • 76. Random forests In a forest there are several independent trees. Each tree grows seeing a different random part of the whole data set. When making a prediction, the trees vote and the forest returns the winning answer. [Figure: several trees, each casting a vote]
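A toy version of the idea, under several simplifying assumptions: each "tree" is a one-question stump on a single feature, the "random part of the data" is a bootstrap sample (drawn with replacement), and the data set is synthetic. Real random forests usually also randomize which features each split may use.

```python
import random
from collections import Counter


def train_stump(sample):
    """Fit a one-question tree 'x > threshold?' on (x, label) pairs."""
    majority = Counter(label for _, label in sample).most_common(1)[0][0]
    xs = sorted({x for x, _ in sample})
    best = None  # (errors, threshold, label_if_low, label_if_high)
    for low_x, high_x in zip(xs, xs[1:]):
        threshold = (low_x + high_x) / 2
        low = [label for x, label in sample if x <= threshold]
        high = [label for x, label in sample if x > threshold]
        low_label = Counter(low).most_common(1)[0][0]
        high_label = Counter(high).most_common(1)[0][0]
        errors = (sum(label != low_label for label in low)
                  + sum(label != high_label for label in high))
        if best is None or errors < best[0]:
            best = (errors, threshold, low_label, high_label)
    if best is None:  # only one distinct x value: always predict the majority
        return lambda x: majority
    _, threshold, low_label, high_label = best
    return lambda x: high_label if x > threshold else low_label


def train_forest(data, n_trees=25, seed=0):
    """Grow each tree on its own bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(data) for _ in data]  # sample with replacement
        forest.append(train_stump(bootstrap))
    return forest


def forest_predict(forest, x):
    """The forest's prediction is the label that receives the most votes."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Synthetic, well-separated data: class 0 at x = 1..5, class 1 at x = 11..15.
DATA = [(x, 0) for x in range(1, 6)] + [(x, 1) for x in range(11, 16)]
forest = train_forest(DATA)
print(forest_predict(forest, 2), forest_predict(forest, 14))
```

Because every tree sees a slightly different sample, individual trees place their cut at different positions; the vote smooths out these differences.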
  • 77. Aggregating the gesture-level predictions After learning, the forest can predict if a gesture was legal or not.
      Avg. speed (pixel/sec)  Duration (sec)  ...  Made by user? (prediction)
      34.5                    5               ...  1
      12.1                    3               ...  1
      1.23                    12              ...  0
      55.9                    3               ...  0
      ...                     ...             ...  ...
  • 78. Aggregating the gesture-level predictions We need to make a decision about a whole session for a user! For this, we aggregate the predictions for the gestures in a whole session by taking the average of the "Made by user? (prediction)" column shown on the previous slide:
      - If the average is < 0.5, the session is regarded as illegal
      - If the average is > 0.5, the session is regarded as legal
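The aggregation step is small enough to show in full. One detail is an assumption: the slide does not say what happens when the average is exactly 0.5, so this sketch treats that case as illegal.

```python
def classify_session(gesture_predictions, threshold=0.5):
    """Average the per-gesture predictions (1 = made by the user, 0 = not)."""
    average = sum(gesture_predictions) / len(gesture_predictions)
    # Assumption: an average of exactly 0.5 is treated as illegal.
    return "legal" if average > threshold else "illegal"


print(classify_session([1, 1, 1, 0]))  # legal   (average 0.75)
print(classify_session([1, 0, 0, 0]))  # illegal (average 0.25)
```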
  • 80. Overfitting is bad, what to do about it? We are afraid of the more complex models but we need them! How should we decide the amount of complexity which is JUST ENOUGH? ● A good model fits on the known examples (obviously) but also fits on unseen examples ● That is the point: predicting the outcome of unseen examples is similar to predicting examples from the future ● We can simulate having new examples by slicing the known dataset into two parts: ○ Training dataset: examples only for training ○ Test dataset: examples only for measuring performance
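  • 81. Validation [Figure: one table holding every known example, with columns Feature #1, Feature #2, ..., Target]
  • 82. Validation [Figure: the table of every known example (columns Feature #1, Feature #2, ..., Target) split into two tables with the same columns: a training set and a test set]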
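A minimal sketch of slicing the known examples into a training set and a test set. The 25% test fraction, the fixed seed, and the function name are arbitrary choices for illustration.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle the known examples and set a fraction aside for testing only."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


examples = list(range(20))  # stand-ins for (features, target) rows
train_set, test_set = train_test_split(examples)
print(len(train_set), len(test_set))  # 15 5
```

Shuffling before the split matters when the examples are stored in a meaningful order (for example, by time or by user); otherwise the test set may not resemble the training set.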
  • 83. How is it done? One can increase the complexity of the learning model as long as the goodness of fit on the UNSEEN data increases. ● The goodness of fit on the training set will increase until full overfitting ● The goodness of fit on the test set will increase, but only up to a certain point We can visualize it with the learning curve.
  • 84. The learning curve “The goodness of fit on the training set will increase until full overfitting.” That is, the error will decrease on the training set until full overfitting. [Figure: error of model (y-axis) against the amount of complexity we allow (x-axis); curve for the training dataset]
  • 85. The learning curve “The goodness of fit on the test set will increase but just to a certain point.” That is, the error on the test set will decrease but just to a certain point. [Figure: error of model (y-axis) against the amount of complexity we allow (x-axis); curves for the training dataset and the unseen test dataset]
  • 86. The learning curve After the optimal point, every “bit of knowledge” the model gains is not general, but just data-specific knowledge about the particular training dataset it sees. [Figure: error of model (y-axis) against the amount of complexity we allow (x-axis); curves for the training dataset and the unseen test dataset, with the optimal complexity marked]
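The learning curve can be reproduced on a toy problem. Everything here is an assumption made for illustration: the data are synthetic (a noisy sine wave), the model is a piecewise-constant fit, and "complexity" is the number of bins the model may use.

```python
import math
import random


def make_data(n, rng):
    """Synthetic data: y = sin(2*pi*x) plus Gaussian noise, x in [0, 1)."""
    xs = [rng.random() for _ in range(n)]
    return [(x, math.sin(2 * math.pi * x) + rng.gauss(0, 0.3)) for x in xs]


def fit_bins(train, n_bins):
    """Piecewise-constant model: predict the mean of y within each equal-width bin."""
    overall = sum(y for _, y in train) / len(train)
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for x, y in train:
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    # Empty bins fall back to the overall mean.
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[min(int(x * n_bins), n_bins - 1)]


def mse(model, data):
    """Mean squared error of the model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
train, test = make_data(40, rng), make_data(40, rng)

complexities = [1, 2, 4, 8, 16, 32]
curve = []
for n_bins in complexities:
    model = fit_bins(train, n_bins)
    curve.append((n_bins, mse(model, train), mse(model, test)))
    print(n_bins, round(curve[-1][1], 3), round(curve[-1][2], 3))

# The optimal complexity is the one with the lowest error on the UNSEEN data.
best_bins = min(curve, key=lambda row: row[2])[0]
print("optimal number of bins:", best_bins)
```

Because each bin count doubles the previous one, every finer model can reproduce the coarser one, so the training error can only go down as complexity grows. The test error has no such guarantee, which is exactly why it is the one used to pick the complexity.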
  • 87. Validation We sacrifice some data (and potentially information) but we gain objective, measurable knowledge about how well our model will perform “out there”. There are some techniques (e.g., cross-validation) which try to eliminate this loss of information by selecting different parts of the known data as training sets, and then aggregating the results of these different scenarios.
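One common form of cross-validation is k-fold: the data are cut into k parts, and each part serves as the test set once while the rest is used for training. The sketch below only handles the index bookkeeping; `evaluate` is a hypothetical callback that would train a model on the training indices and return its score on the test indices.

```python
def k_fold_splits(n_examples, k):
    """Yield (train_indices, test_indices) so that every example is tested exactly once."""
    indices = list(range(n_examples))
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(n_examples, k, evaluate):
    """Aggregate the k scenarios by averaging their scores."""
    scores = [evaluate(train, test) for train, test in k_fold_splits(n_examples, k)]
    return sum(scores) / len(scores)


for train, test in k_fold_splits(10, 3):
    print(test)
# [0, 1, 2, 3]
# [4, 5, 6]
# [7, 8, 9]
```

In practice the examples are usually shuffled before the folds are cut, for the same reason as with a single train/test split.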
  • 88. Part 7 Measuring performance of predictive models
  • 89. Measuring performance What do we mean exactly by “goodness of fit”? We would like to have minor differences between the predictions and the real value of the target attributes from the test data set. If our problem is regression (the truth is a continuous variable): ● Add up the sizes of the differences between the prediction and the truth for all examples (e.g., the squared differences, so that positive and negative errors do not cancel out); ● The smaller the sum, the better our model. ● Exact match is rare, but a close guess is usable. ● E.g.: RMSE, root of mean squared error
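RMSE takes the differences between predictions and truths, squares them, averages the squares, and takes the square root: RMSE = sqrt((1/n) * sum of (prediction_i - truth_i)^2). A short sketch with made-up numbers:

```python
import math


def rmse(predictions, truths):
    """Root of the mean squared error between predictions and true values."""
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, truths)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up example: both predictions are off by 3, in opposite directions.
print(rmse([1003, 1997], [1000, 2000]))  # 3.0
```

Squaring keeps an overestimate and an underestimate from cancelling each other, and the square root brings the result back to the unit of the target variable.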
  • 90. Measuring performance What do we mean exactly by “goodness of fit”? We would like to have minor differences between the predictions and the real value of the target attributes from the test data set. If our problem is classification: ● If the predicted class misses the true class, there is no magnitude of error. Not correct is not correct. (There is no “slightly pregnant woman”.) ● Counting the rate of the correct predictions seems like a good idea, but it is not a great one.
  • 91. What can a classifier model do? Not so many things, considering two classes, namely: “positive” and “negative”: Predicts “Positive” when the reality is “Positive” Predicts “Positive” when the reality is “Negative” Predicts “Negative” when the reality is “Negative” Predicts “Negative” when the reality is “Positive”
  • 92. Let’s make a small table If we rearrange the smiley faces (True Positive, True Negative, False Positive, False Negative):
                      Reality +     Reality -
      Prediction +    True Pos.     False Pos.
      Prediction -    False Neg.    True Neg.
    The confusion matrix catches ‘em all. (The most important 2-by-2 matrix in machine learning.)
  • 93. Let’s make a small table With a perfect model:
                      Reality +     Reality -
      Prediction +    5             0
      Prediction -    0             5
    ● 5 positive and 5 negative cases in the dataset to be predicted. ● Every prediction is correct.
  • 94. Let’s make a small table A more realistic scenario:
                      Reality +     Reality -
      Prediction +    4             1
      Prediction -    1             4
    ● 5 positive and 5 negative cases in the dataset to be predicted ● There is one misclassified case for each class.
  • 95. Accuracy = calculate the rate of the correctly classified cases.
                      Reality +     Reality -
      Prediction +    985           5
      Prediction -    5             5
    With the confusion matrix: sum(blue cells, i.e., the correctly classified cases) / sum(all cells) = 990/1000 = 99% Is this a good model? Note that in a case like this, the model is likely to be used to spot the NEGATIVE events. (Those are the rare, interesting cases.) This particular model has an awful performance on those cases. Half of them are mis-classified! How was that comment on measuring performance?
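The numbers on this slide can be checked with a few lines of Python. The label encoding (1 for positive, 0 for negative) and the function name are assumptions; the counts are the ones from the slide.

```python
def confusion_matrix(truths, predictions):
    """Count the four possible outcomes for a two-class problem (1 = positive)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, prediction in zip(truths, predictions):
        if prediction == 1:
            counts["TP" if truth == 1 else "FP"] += 1
        else:
            counts["TN" if truth == 0 else "FN"] += 1
    return counts


# Slide 95: 990 real positives (985 found, 5 missed), 10 real negatives (5 found, 5 missed).
truths = [1] * 990 + [0] * 10
predictions = [1] * 985 + [0] * 5 + [1] * 5 + [0] * 5

cm = confusion_matrix(truths, predictions)
accuracy = (cm["TP"] + cm["TN"]) / len(truths)
negatives_found = cm["TN"] / (cm["TN"] + cm["FP"])
print(cm)               # {'TP': 985, 'FP': 5, 'TN': 5, 'FN': 5}
print(accuracy)         # 0.99
print(negatives_found)  # 0.5
```

The accuracy is 99%, yet only half of the rare negative cases are recognized, which is why accuracy alone can be misleading on imbalanced data.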
  • 96. Some performance measures... There are several other methods which use the values of the confusion matrix in order to evaluate a classification model. The method needs to be chosen carefully for the purpose of the application.
  • 97. Many classifiers don’t give strict verdicts Though the target variable might be a discrete variable (orange/green), in practice classifier models give back class probabilities (e.g., X% chance of being green). This means that one can decide which probability is high enough to predict a particular label. If a song seems to be 95% “cheerful”, it’s a safer bet than one which is 52% cheerful. [Figure: events placed along an axis from 0% to 100%] Legend: Color: the true class of a particular event known from the test dataset Position: the probability of being in the green class as estimated by the model.
  • 98. Many classifiers don’t give strict verdicts Though the target variable might be a discrete variable (orange/green), in practice classifier models give back class probabilities (e.g., X% chance of being green). [Figure: events placed along an axis from 0% to 100%] The good news: We have a much more detailed view on how the model works, and on the amount of confidence it has about each prediction it makes. The bad news: In order to retrieve discrete predictions the user must decide how to transform the probabilities into classes, i.e., define a probability threshold which separates the classes.
  • 99. Many classifiers don’t give strict verdicts Though the target variable might be a discrete variable (orange/green), in practice classifier models give back class probabilities (e.g., X% chance of being green). [Figure: events placed along an axis from 0% to 100%] Legend: Color: the true class of a particular event known from the test dataset Position: the probability of being in the green class as estimated by the model. This is an amazing model! We can find a point in the middle which separates the points into two groups that are 100% the same as the original two categories.
  • 100. Remember… we don’t have perfect models :( What should we do when our model outputs something more realistic, like this: [Figure: events placed along an axis from 0% to 100%, with the two colors overlapping in the middle] Legend: Color: the true class of a particular event known from the test dataset Position: the probability of being in the green class as estimated by the model. On the two sides, the picture is clear. But there are some borderline cases, where there is some confusion. Two questions arise: ● How should we find a good threshold? ● How to evaluate a model without a pre-defined threshold?
  • 101. How should we find a good threshold? Finding a threshold depends on the application and the problem domain itself, and has little to do with machine learning. A threshold with a low false positive rate is needed before applying a risky treatment. A threshold with a low false negative rate is needed before a blood transfusion. [Figure: events placed along an axis from 0% to 100%, with two candidate thresholds marked A and B] - Towards the left-hand side, we classify every green correctly but misclassify a lot of oranges as greens. This means a lot of false positives. - Towards the right-hand side, we classify every orange correctly, but misclassify a lot of greens as oranges. This means a lot of false negatives.
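The trade-off can be seen by counting the two kinds of errors at different thresholds. The scores below are made up for illustration, green is taken as the positive class, and an event is assumed to be predicted green when its probability is at or above the threshold.

```python
def errors_at_threshold(scored, threshold):
    """Return (false positives, false negatives) for a given probability threshold.

    scored: (estimated probability of green, true class) pairs.
    """
    false_positives = sum(
        1 for p, truth in scored if p >= threshold and truth == "orange"
    )
    false_negatives = sum(
        1 for p, truth in scored if p < threshold and truth == "green"
    )
    return false_positives, false_negatives


# Made-up model output with some borderline cases in the middle.
SCORED = [
    (0.05, "orange"), (0.20, "orange"), (0.40, "green"), (0.45, "orange"),
    (0.55, "green"), (0.60, "orange"), (0.80, "green"), (0.95, "green"),
]

for threshold in (0.1, 0.5, 0.9):
    print(threshold, errors_at_threshold(SCORED, threshold))
# 0.1 (3, 0)
# 0.5 (1, 1)
# 0.9 (0, 3)
```

A low threshold finds every green at the price of false positives; a high threshold avoids false positives but misses greens. Which end is preferable is a question for the application, not for the model.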
  • 102. How to evaluate without a pre-defined threshold? Every application has to deal with the false positive / false negative trade-off, and each deals with it differently. Regardless of the application, we have to be able to tell objectively whether one model is better than another. Why not compute the false positives and false negatives for EVERY threshold, and judge a particular model by considering ALL of these different scenarios? [Figure: events placed along an axis from 0% to 100%, with thresholds A and B marked]
  • 103. ROC curve (Receiver Operating Characteristic) Every point on the red curve shows the corresponding rate of false positives and true positives for a particular threshold. The dotted line is a random model. The further the red line is from the dotted line, the better the model. [Figure: ROC plot; x-axis: how many false positives at a threshold (0% to 100%); y-axis: how many true positives at a threshold (0% to 100%)]
  • 104. ROC curve (Receiver Operating Characteristic) AUC = “Area Under Curve” The bigger the area under the red line, the better the model. The area under the dotted line: 0.5 The perfect model: 1.0 “You can decrease the false-positive rate to 0, and in that process, you don’t generate any false negatives.” [Figure: ROC plot; x-axis: how many false positives at a threshold (0% to 100%); y-axis: how many true positives at a threshold (0% to 100%)]
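A minimal sketch of computing the ROC points and the AUC by hand, assuming each test example comes with a score and a true class, and that the test set contains at least one positive and one negative. The area is computed with the trapezoid rule; the example scores are made up.

```python
def roc_points(scored):
    """Return the (false positive rate, true positive rate) point for every threshold.

    scored: (score, is_positive) pairs; a higher score means "more likely positive".
    """
    positives = sum(1 for _, is_positive in scored if is_positive)
    negatives = len(scored) - positives
    ordered = sorted(scored, key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    for i, (score, is_positive) in enumerate(ordered):
        if is_positive:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last example of a group of tied scores.
        if i + 1 == len(ordered) or ordered[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve, using the trapezoid rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


perfect = [(0.9, True), (0.8, True), (0.2, False), (0.1, False)]
mixed = [(0.9, True), (0.7, False), (0.6, True), (0.3, False)]
print(auc(roc_points(perfect)))  # 1.0
print(auc(roc_points(mixed)))    # 0.75
```

The AUC of 0.75 for the second model equals the share of (positive, negative) pairs in which the positive example gets the higher score: three of the four pairs.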
  • 105. Wrap up
      - Data science as a field is big and diverse; machine learning is a key tool to master
      - Given enough examples, machines can learn
      - Learning is more complex than memorizing
      - A great effort is needed to prepare the examples (features and target)
      - The bigger challenge is not fitting a model, but avoiding overfitting
      - Several key decisions have to be made after a model has been constructed