SlideShare uma empresa Scribd logo
1 de 50
Lecture 01
Introduction to Machine Learning
What is Machine Learning?
 All useful Programs “Learn” something.
 Early definition of Machine Learning
“ Field of study that gives computers the ability to learn without
being explicitly programmed” Arthur Samuel (1959).
 Computer pioneer who wrote first self-learning program,
which played checkers- learned from experience.
 Tom Mitchell(1997)- “Said to learn from experience with
respect to some class of tasks, and a performance measure P,
if[ the learner’s] performance at tasks in the class as measured
by P, improves with experience”
Contd…
Traditional Programming:
Machine Learning:
Machine Learning is the central part of some other
courses like Natural Language Processing, Computational
Biology, Computer Vision, robotics etc.
Computer
Data
Output
Program
Computer
Data
Program
Output
How are things learned?
 Memorization:
 Accumulation of individual facts
Limited by-
Time to observe facts
Memory to store facts
 Generalization:
 Deduce new facts from old facts
Limited by accuracy of deduction process
 Essentially a predictive activity
 Assumes that past predicts the future
 Interested in extending to programs that can infer useful
information from implicit patterns in data.
Declarative Knowledge
Imperative Knowledge
Basic Paradigm:
 Observe set of examples: Training data
e.g., Football Players, labeled by position with height and
weight data.
 Infer something about process that generated that data
 Find canonical model of position by statistics
 Use interface to make predictions about previously
unseen data: Test data
e.g., predict position of new players.
Variations of Paradigm:
 Supervised Learning
 Learn an input to output map
• Classification: categorical output
• Regression: continuous output
Unsupervised Learning:
 Discover patterns in the data
• Clustering: cohesive grouping
• Association: Frequent cooccurance
 Reinforcement Learning
 Learning control
Machine Learning Tasks:
Task Measure
 Classification Classification error
 Regression Prediction error
 Clustering scatter/purity
 Associations Support/confidence
Challenges:
 How good is a model?
 How do I choose a model?
 Is the data of sufficient quality?
 Errors in data. E.g., Age=225; noise in low resolution images
 Missing vales
Contd…
 How confident can I be of the results?
 Am I describing the data correctly?
 Are age and income enough? Should I look at Gender also?
 How should I represent age? As a number or as young,
middle age, old?
Supervised Learning:
Labeled Training Data
Classification Possible
Classifiers
Supervised Learning:
Inductive Bias:
 Need to generalize
Assumption about lines
Encoding/Normalization of data
In general Inductive Bias: X1=
<0.15,0.25>,Y1= -1
 Language Bias and Search Bias X2=
<0.4,0.45>,Y2= +1
Process of supervised learning:
Training data X1=<30000,25>,Y1= doesnotbuycomputer
X2=<80000,45>,Y2= buycomputer
X1, Y1
X2, Y2
X3, Y3
-------
-------
Training Algorithm
X1, Y1
X2, Y2
X3, Y3
Testing
data
Classifier
Validiation
Supervised Learning:
Applications:
1. Credit Card Fraud detection (valid transaction or
not)
2. Sentiment Analysis (Opinion mining, buzz
analysis etc)
3. Churn Prediction (Potential Churner or not?)
4. Medical Diagnoses (Risk Analysis)
Prediction or Regression:
Regression:
Applications:
1.Time series Predictions
Rain Fall in a certain region, spend on
voice call
2.Classifications
3. Data Reduction
4. Trend Analysis
5. Risk Factor Analysis
Examples of Classifying and Clustering:
Unlabeled Data:
Clustering examples into group:
Similarity based on weight:
Similarity based on Height:
Cluster into two groups using both attributes:
Labeled Data:
Finding Classifier surface:
 Given Labeled groups into feature space, want to
find a surface
in that space that separates the groups.
 Subject to constraints on the complexity of
subsurface
 In this example, have 2 D space, so find line(or
connected set of line segments) that best
separates the two groups.
 When the examples get separated, it is straight
forward.
 When examples in labeled groups overlap, may
have to trade of false positives and false negative.
Labeled Data:
Adding some new data:
 Suppose we have learned to separate receivers
versus linemen
 Now we are given some running backs, and want to
use model to decide if they are more like receiver or
linemen.
Blount = [‘blount’, 75, 250]
white = [ ‘white’, 70, 205]
Clustering using Unlabeled data:
Classified using Labeled data:
Machine Learning Methods:
Requirements for Methods:
 Choosing training data and evaluation method.
 Representation of the features
 Distance matrix for feature vectors
 Objective function and constraints
 Optimization method for learning the model
Feature Representation:
An Example:
Contd…
Need to measure distance between Features
Feature Engineering:
 Deciding which features to include and which
are merely adding noise to classifier.
 Defining how to measure distances between
training examples (and ultimately between
classifiers and new instances)
 Deciding how to weight relative importance of
different dimensions of features vector, which
impacts defination of distance.
Measuring distance between animals:
 We can think of our animal examples as consisting
of four binary features and one integer features.
 One way to learn to separate Reptiles from non-
reptiles is to measure the distance between pairs of
examples and use that:
 To cluster nearby examples into a common class
(Unlabeled) data or
 To find a classifier surface in space of examples
that optimally separates different (Labeled)
collections of examples from ther collections.
Minkowski Matric:
Euclidean Distance between animals:
Euclidean Distance between animals:
Using Euclidean Distance rattlesnake and boa
constrictor are much closer to each other , than they are
to the dart frog.
Add an Alligator
Alligator is closer to dart frog than snake-why?
 Alligator differs from frog in 3 features, from boa in
only 2 features.
 But scale on “legs” is from 0 to 4, on other features
is 0 to 1
 “legs” dimension is disproportionately large
Using Binary Features:
Now alligator is closer to snakes than it is to
frog
- Make sense
Clustering Approach:
 Suppose we know that there are k different groups
in our training data, but we don’t know labels.
 Pick K samples (at random?) as exemplars
 Cluster remaining samples by minimizing distance
between samples in same cluster(objective function)
put samples in group with closest exemplars.
 Find median example in each cluster as new
exemplars
 Repeat until no change
Issues:
 How do we decide on the best number of clusters?
 How do we select the best features, the best
distance metric?
Classification Approach:
 Want to find boundaries in feature space that
separate different classes of labeled examples.
 Look for simple surface (e.g., best line or plane)
that separates classes.
 Look for more complex surface(subject to
constraints) that separate classes.
 Use voting schemes.
use k nearest training examples, use majority vote
to select label.
Issues:
 How do we avoid over-fitting to data?
 How do we measure performance?
 How do we select best features?
Classification:
 Attempt to minimize error on training data
 Similar to fitting a curve to data
Evaluate on training data.
Randomly Divide Data into Training and Test set:
Two possible models for a Training Set
Confusion Matrices (Training Error)
Training Accuracy of Model:
 0.7 for both models?
Which is better?
 Can we find a model with less training error? Yes.
Applying Model to test Data:
Other statistical Measures:
Summery:
 Machine learning methods provide a way of
building models of processes from data sets
 Supervised Learning uses labeled and create
classifiers that optimally separate data into known
classes.
 Unsupervised Learning tries to infer latent
variables by clustering training examples into
nearby groups.
 Choice of features influences results.
 Choice of distance measurement between
examples influences results.
 We will see some clustering methods like k-
means.

Mais conteúdo relacionado

Mais procurados

Chapter 09 class advanced
Chapter 09 class advancedChapter 09 class advanced
Chapter 09 class advancedHouw Liong The
 
Machine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesMachine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesPier Luca Lanzi
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clusteringArshad Farhad
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.butest
 
Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...butest
 
Tweets Classification using Naive Bayes and SVM
Tweets Classification using Naive Bayes and SVMTweets Classification using Naive Bayes and SVM
Tweets Classification using Naive Bayes and SVMTrilok Sharma
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningKmPooja4
 
Bag the model with bagging
Bag the model with baggingBag the model with bagging
Bag the model with baggingChode Amarnath
 
ensemble learning
ensemble learningensemble learning
ensemble learningbutest
 
Textmining Retrieval And Clustering
Textmining Retrieval And ClusteringTextmining Retrieval And Clustering
Textmining Retrieval And Clusteringguest0edcaf
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.pptbutest
 
20070702 Text Categorization
20070702 Text Categorization20070702 Text Categorization
20070702 Text Categorizationmidi
 
Learning to compare: relation network for few shot learning
Learning to compare: relation network for few shot learningLearning to compare: relation network for few shot learning
Learning to compare: relation network for few shot learningSimon John
 
Spss tutorial-cluster-analysis
Spss tutorial-cluster-analysisSpss tutorial-cluster-analysis
Spss tutorial-cluster-analysisAnimesh Kumar
 

Mais procurados (20)

Chapter 09 class advanced
Chapter 09 class advancedChapter 09 class advanced
Chapter 09 class advanced
 
Machine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers EnsemblesMachine Learning and Data Mining: 16 Classifiers Ensembles
Machine Learning and Data Mining: 16 Classifiers Ensembles
 
Terminology Machine Learning
Terminology Machine LearningTerminology Machine Learning
Terminology Machine Learning
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clustering
 
Ensemble methods
Ensemble methodsEnsemble methods
Ensemble methods
 
Multiple Classifier Systems
Multiple Classifier SystemsMultiple Classifier Systems
Multiple Classifier Systems
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...Ensemble Learning Featuring the Netflix Prize Competition and ...
Ensemble Learning Featuring the Netflix Prize Competition and ...
 
Tweets Classification using Naive Bayes and SVM
Tweets Classification using Naive Bayes and SVMTweets Classification using Naive Bayes and SVM
Tweets Classification using Naive Bayes and SVM
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Bag the model with bagging
Bag the model with baggingBag the model with bagging
Bag the model with bagging
 
L4. Ensembles of Decision Trees
L4. Ensembles of Decision TreesL4. Ensembles of Decision Trees
L4. Ensembles of Decision Trees
 
ensemble learning
ensemble learningensemble learning
ensemble learning
 
Textmining Retrieval And Clustering
Textmining Retrieval And ClusteringTextmining Retrieval And Clustering
Textmining Retrieval And Clustering
 
Lect4
Lect4Lect4
Lect4
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
 
20070702 Text Categorization
20070702 Text Categorization20070702 Text Categorization
20070702 Text Categorization
 
Learning to compare: relation network for few shot learning
Learning to compare: relation network for few shot learningLearning to compare: relation network for few shot learning
Learning to compare: relation network for few shot learning
 
Spss tutorial-cluster-analysis
Spss tutorial-cluster-analysisSpss tutorial-cluster-analysis
Spss tutorial-cluster-analysis
 

Semelhante a Lecture 09(introduction to machine learning)

slides
slidesslides
slidesbutest
 
slides
slidesslides
slidesbutest
 
Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)butest
 
Introduction
IntroductionIntroduction
Introductionbutest
 
Introduction
IntroductionIntroduction
Introductionbutest
 
Introduction
IntroductionIntroduction
Introductionbutest
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningAkshay Kanchan
 
Introduction to Machine Learning.
Introduction to Machine Learning.Introduction to Machine Learning.
Introduction to Machine Learning.butest
 
Presentation on supervised learning
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learningTonmoy Bhagawati
 
AI_06_Machine Learning.pptx
AI_06_Machine Learning.pptxAI_06_Machine Learning.pptx
AI_06_Machine Learning.pptxYousef Aburawi
 
Lecture 5 machine learning updated
Lecture 5   machine learning updatedLecture 5   machine learning updated
Lecture 5 machine learning updatedVajira Thambawita
 
Machine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptxMachine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptxVenkateswaraBabuRavi
 
Tech meetup Data Driven - Codemotion
Tech meetup Data Driven - Codemotion Tech meetup Data Driven - Codemotion
Tech meetup Data Driven - Codemotion antimo musone
 
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai UniversityMadhav Mishra
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Chapter01.ppt
Chapter01.pptChapter01.ppt
Chapter01.pptbutest
 

Semelhante a Lecture 09(introduction to machine learning) (20)

slides
slidesslides
slides
 
slides
slidesslides
slides
 
Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)Lecture #1: Introduction to machine learning (ML)
Lecture #1: Introduction to machine learning (ML)
 
Introduction
IntroductionIntroduction
Introduction
 
Introduction
IntroductionIntroduction
Introduction
 
Introduction
IntroductionIntroduction
Introduction
 
module 6 (1).ppt
module 6 (1).pptmodule 6 (1).ppt
module 6 (1).ppt
 
Machine Learning - Deep Learning
Machine Learning - Deep LearningMachine Learning - Deep Learning
Machine Learning - Deep Learning
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
Introduction to Machine Learning.
Introduction to Machine Learning.Introduction to Machine Learning.
Introduction to Machine Learning.
 
Presentation on supervised learning
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learning
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
AI_06_Machine Learning.pptx
AI_06_Machine Learning.pptxAI_06_Machine Learning.pptx
AI_06_Machine Learning.pptx
 
Lecture 5 machine learning updated
Lecture 5   machine learning updatedLecture 5   machine learning updated
Lecture 5 machine learning updated
 
Machine_Learning.pptx
Machine_Learning.pptxMachine_Learning.pptx
Machine_Learning.pptx
 
Machine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptxMachine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptx
 
Tech meetup Data Driven - Codemotion
Tech meetup Data Driven - Codemotion Tech meetup Data Driven - Codemotion
Tech meetup Data Driven - Codemotion
 
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai University
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Chapter01.ppt
Chapter01.pptChapter01.ppt
Chapter01.ppt
 

Mais de Jeet Das

Lecture 13
Lecture 13Lecture 13
Lecture 13Jeet Das
 
Lecture 12
Lecture 12Lecture 12
Lecture 12Jeet Das
 
Lecture 10
Lecture 10Lecture 10
Lecture 10Jeet Das
 
Information Retrieval 08
Information Retrieval 08 Information Retrieval 08
Information Retrieval 08 Jeet Das
 
Information Retrieval 02
Information Retrieval 02Information Retrieval 02
Information Retrieval 02Jeet Das
 
Information Retrieval 07
Information Retrieval 07Information Retrieval 07
Information Retrieval 07Jeet Das
 
Information Retrieval-06
Information Retrieval-06Information Retrieval-06
Information Retrieval-06Jeet Das
 
Information Retrieval-05(wild card query_positional index_spell correction)
Information Retrieval-05(wild card query_positional index_spell correction)Information Retrieval-05(wild card query_positional index_spell correction)
Information Retrieval-05(wild card query_positional index_spell correction)Jeet Das
 
Information Retrieval-4(inverted index_&amp;_query handling)
Information Retrieval-4(inverted index_&amp;_query handling)Information Retrieval-4(inverted index_&amp;_query handling)
Information Retrieval-4(inverted index_&amp;_query handling)Jeet Das
 
Information Retrieval-1
Information Retrieval-1Information Retrieval-1
Information Retrieval-1Jeet Das
 
Token classification using Bengali Tokenizer
Token classification using Bengali TokenizerToken classification using Bengali Tokenizer
Token classification using Bengali TokenizerJeet Das
 
Silent sound technology
Silent sound technologySilent sound technology
Silent sound technologyJeet Das
 

Mais de Jeet Das (13)

Lecture 13
Lecture 13Lecture 13
Lecture 13
 
Lecture 12
Lecture 12Lecture 12
Lecture 12
 
Lecture 10
Lecture 10Lecture 10
Lecture 10
 
Information Retrieval 08
Information Retrieval 08 Information Retrieval 08
Information Retrieval 08
 
Information Retrieval 02
Information Retrieval 02Information Retrieval 02
Information Retrieval 02
 
Information Retrieval 07
Information Retrieval 07Information Retrieval 07
Information Retrieval 07
 
Information Retrieval-06
Information Retrieval-06Information Retrieval-06
Information Retrieval-06
 
Information Retrieval-05(wild card query_positional index_spell correction)
Information Retrieval-05(wild card query_positional index_spell correction)Information Retrieval-05(wild card query_positional index_spell correction)
Information Retrieval-05(wild card query_positional index_spell correction)
 
Information Retrieval-4(inverted index_&amp;_query handling)
Information Retrieval-4(inverted index_&amp;_query handling)Information Retrieval-4(inverted index_&amp;_query handling)
Information Retrieval-4(inverted index_&amp;_query handling)
 
Information Retrieval-1
Information Retrieval-1Information Retrieval-1
Information Retrieval-1
 
NLP
NLPNLP
NLP
 
Token classification using Bengali Tokenizer
Token classification using Bengali TokenizerToken classification using Bengali Tokenizer
Token classification using Bengali Tokenizer
 
Silent sound technology
Silent sound technologySilent sound technology
Silent sound technology
 

Último

Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesPrabhanshu Chaturvedi
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 

Último (20)

Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 

Lecture 09(introduction to machine learning)

  • 1. Lecture 01 Introduction to Machine Learning
  • 2. What is Machine Learning?  All useful Programs “Learn” something.  Early definition of Machine Learning “ Field of study that gives computers the ability to learn without being explicitly programmed” Arthur Samuel (1959).  Computer pioneer who wrote first self-learning program, which played checkers- learned from experience.  Tom Mitchell(1997)- “Said to learn from experience with respect to some class of tasks, and a performance measure P, if[ the learner’s] performance at tasks in the class as measured by P, improves with experience”
  • 3. Contd… Traditional Programming: Machine Learning: Machine Learning is the central part of some other courses like Natural Language Processing, Computational Biology, Computer Vision, robotics etc. Computer Data Output Program Computer Data Program Output
  • 4. How are things learned?  Memorization:  Accumulation of individual facts Limited by- Time to observe facts Memory to store facts  Generalization:  Deduce new facts from old facts Limited by accuracy of deduction process  Essentially a predictive activity  Assumes that past predicts the future  Interested in extending to programs that can infer useful information from implicit patterns in data. Declarative Knowledge Imperative Knowledge
  • 5. Basic Paradigm:  Observe set of examples: Training data e.g., Football Players, labeled by position with height and weight data.  Infer something about process that generated that data  Find canonical model of position by statistics  Use interface to make predictions about previously unseen data: Test data e.g., predict position of new players.
  • 6. Variations of Paradigm:  Supervised Learning  Learn an input to output map • Classification: categorical output • Regression: continuous output Unsupervised Learning:  Discover patterns in the data • Clustering: cohesive grouping • Association: Frequent cooccurance  Reinforcement Learning  Learning control
  • 7. Machine Learning Tasks: Task Measure  Classification Classification error  Regression Prediction error  Clustering scatter/purity  Associations Support/confidence Challenges:  How good is a model?  How do I choose a model?  Is the data of sufficient quality?  Errors in data. E.g., Age=225; noise in low resolution images  Missing vales
  • 8. Contd…  How confident can I be of the results?  Am I describing the data correctly?  Are age and income enough? Should I look at Gender also?  How should I represent age? As a number or as young, middle age, old?
  • 9. Supervised Learning: Labeled Training Data Classification Possible Classifiers
  • 11. Inductive Bias:  Need to generalize Assumption about lines Encoding/Normalization of data In general Inductive Bias: X1= <0.15,0.25>,Y1= -1  Language Bias and Search Bias X2= <0.4,0.45>,Y2= +1 Process of supervised learning: Training data X1=<30000,25>,Y1= doesnotbuycomputer X2=<80000,45>,Y2= buycomputer X1, Y1 X2, Y2 X3, Y3 ------- ------- Training Algorithm X1, Y1 X2, Y2 X3, Y3 Testing data Classifier Validiation
  • 12. Supervised Learning: Applications: 1. Credit Card Fraud detection (valid transaction or not) 2. Sentiment Analysis (Opinion mining, buzz analysis etc) 3. Churn Prediction (Potential Churner or not?) 4. Medical Diagnoses (Risk Analysis)
  • 14. Regression: Applications: 1.Time series Predictions Rain Fall in a certain region, spend on voice call 2.Classifications 3. Data Reduction 4. Trend Analysis 5. Risk Factor Analysis
  • 15. Examples of Classifying and Clustering:
  • 20. Cluster into two groups using both attributes:
  • 22. Finding Classifier surface:  Given Labeled groups into feature space, want to find a surface in that space that separates the groups.  Subject to constraints on the complexity of subsurface  In this example, have 2 D space, so find line(or connected set of line segments) that best separates the two groups.  When the examples get separated, it is straight forward.  When examples in labeled groups overlap, may have to trade of false positives and false negative.
  • 24. Adding some new data:  Suppose we have learned to separate receivers versus linemen  Now we are given some running backs, and want to use model to decide if they are more like receiver or linemen. Blount = [‘blount’, 75, 250] white = [ ‘white’, 70, 205]
  • 28. Requirements for Methods:  Choosing training data and evaluation method.  Representation of the features  Distance matrix for feature vectors  Objective function and constraints  Optimization method for learning the model
  • 32.
  • 33.
  • 34. Need to measure distance between Features Feature Engineering:  Deciding which features to include and which are merely adding noise to classifier.  Defining how to measure distances between training examples (and ultimately between classifiers and new instances)  Deciding how to weight relative importance of different dimensions of features vector, which impacts defination of distance.
  • 35. Measuring distance between animals:  We can think of our animal examples as consisting of four binary features and one integer features.  One way to learn to separate Reptiles from non- reptiles is to measure the distance between pairs of examples and use that:  To cluster nearby examples into a common class (Unlabeled) data or  To find a classifier surface in space of examples that optimally separates different (Labeled) collections of examples from ther collections.
  • 38. Euclidean Distance between animals: Using Euclidean Distance rattlesnake and boa constrictor are much closer to each other , than they are to the dart frog.
  • 39. Add an Alligator Alligator is closer to dart frog than snake-why?  Alligator differs from frog in 3 features, from boa in only 2 features.  But scale on “legs” is from 0 to 4, on other features is 0 to 1  “legs” dimension is disproportionately large
  • 40. Using Binary Features: Now alligator is closer to snakes than it is to frog - Make sense
  • 41. Clustering Approach:  Suppose we know that there are k different groups in our training data, but we don’t know labels.  Pick K samples (at random?) as exemplars  Cluster remaining samples by minimizing distance between samples in same cluster(objective function) put samples in group with closest exemplars.  Find median example in each cluster as new exemplars  Repeat until no change Issues:  How do we decide on the best number of clusters?  How do we select the best features, the best distance metric?
  • 42. Classification Approach:  Want to find boundaries in feature space that separate different classes of labeled examples.  Look for simple surface (e.g., best line or plane) that separates classes.  Look for more complex surface(subject to constraints) that separate classes.  Use voting schemes. use k nearest training examples, use majority vote to select label. Issues:  How do we avoid over-fitting to data?  How do we measure performance?  How do we select best features?
  • 43. Classification:  Attempt to minimize error on training data  Similar to fitting a curve to data Evaluate on training data.
  • 44. Randomly Divide Data into Training and Test set:
  • 45. Two possible models for a Training Set
  • 47. Training Accuracy of Model:  0.7 for both models? Which is better?  Can we find a model with less training error? Yes.
  • 48. Applying Model to test Data:
  • 50. Summery:  Machine learning methods provide a way of building models of processes from data sets  Supervised Learning uses labeled and create classifiers that optimally separate data into known classes.  Unsupervised Learning tries to infer latent variables by clustering training examples into nearby groups.  Choice of features influences results.  Choice of distance measurement between examples influences results.  We will see some clustering methods like k- means.