Lecture #1: Introduction to machine learning (ML)
Hardware speed and capability increase at a faster rate than software. The gap is widening daily.
Programs still need to be handcrafted by programmers.
Since the 1950s, computer scientists have tried to give computers the ability to learn.
ML (Mitchell) – Subfield of AI concerned with computer programs that learn from experience.
ML is building computer programs that improve their performance on some task using
observed data or past experience.
An ML program (learner) tries to learn from the observed data (examples) and generates a model that
could respond (predict) to future data or describe the data seen.
A model is, then, a structure that represents or summarizes some data.
Example: ML program gets a set of patient cases with their diagnoses. The program will either:
Predict a disease present in future patients, or
Describe the relationship between diseases and symptoms
ML is like searching a very large space of hypotheses to find the one that best fits the observed data, and
that can generalize well with observations outside the training set.
Great goal of ML
Tell the computer what task we want it to perform and have it learn to perform that task efficiently.
ML: emphasis on learning, as opposed to expert systems: emphasis on expert knowledge
Expert systems don't learn from experiences
They encode expert knowledge about how to make particular kinds of decisions.
ML is an interdisciplinary field using principles and concepts from Statistics, Computer Science, Applied
Mathematics, Cognitive Science, Engineering, Economics, Neuroscience
ML includes algorithms and techniques found in Data Mining, Pattern Recognition, Neural Networks …
ML: When?
When expertise does not exist (navigating on Mars)
Solution cannot be expressed by a deterministic equation (face recognition)
Solution changes in time (routing on a computer network)
Solution needs to be adapted to particular cases (user biometrics)
ML: Applications
Medical diagnosis
Market basket analysis
Image/text retrieval
Automatic speech recognition
Object, face, or handwriting recognition
Financial prediction
Bioinformatics (e.g., protein structure prediction)
Robotics
Types of Learning
Supervised learning
It occurs when the observed data includes the correct or expected output. Learning is called:
Detection if output is binary (Y/N, 0/1, True/False).
Example: Fraud detection
Classification if output is one of several classes (e.g., output is either low, medium, or high).
Example: Credit Scoring
Two classes of customers asking for a loan: low-risk and high-risk.
Input features are their income and savings.
Classifier using discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
Finding the right values for θ1 and θ2 is part of learning
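The two-threshold discriminant above can be sketched in a few lines; here "learning" is just a brute-force search for the best thresholds. The toy data, grid range, and helper names are illustrative assumptions, not part of the lecture.

```python
import itertools

# (income, savings, label) -- label 1 = low-risk, 0 = high-risk; made-up data.
examples = [
    (60, 30, 1), (80, 50, 1), (75, 40, 1),
    (30, 10, 0), (40, 35, 0), (70, 5, 0),
]

def predict(income, savings, theta1, theta2):
    # IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk
    return 1 if income > theta1 and savings > theta2 else 0

def accuracy(theta1, theta2):
    hits = sum(predict(i, s, theta1, theta2) == y for i, s, y in examples)
    return hits / len(examples)

# "Learning" here: search a grid of candidate thresholds for the best pair.
best = max(itertools.product(range(0, 101, 5), repeat=2),
           key=lambda t: accuracy(*t))
```

A real learner would optimize the thresholds more cleverly, but the principle is the same: the hypothesis space is all (θ1, θ2) pairs, and learning picks the pair that best fits the observed data.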
Other classifiers use density estimation (instead of finding a discriminant), where each
class is represented by a probability density function (e.g., a Gaussian)
Several classification applications: Face recognition, character recognition, speech recognition,
medical diagnosis ...
Regression if output is a real value.
Example: Determining the price of a car
x: car attributes, y: price (y = wx+w0)
Finding the right values for w parameters and regression model (e.g., linear, quadratic) is learning.
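Fitting y = wx + w0 amounts to solving a least-squares problem. A minimal sketch, where the toy car data (x = age in years, y = price) is invented for illustration:

```python
import numpy as np

# Toy data: x = car age (years), y = price; the numbers roughly follow
# the line y = -x + 10 plus a little noise (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([9.0, 8.1, 6.9, 6.2, 5.0])

# Fit y = w*x + w0 by least squares: stack the columns [x, 1] and solve.
A = np.column_stack([x, np.ones_like(x)])
(w, w0), *_ = np.linalg.lstsq(A, y, rcond=None)
```

Learning here is finding the w and w0 that minimize the squared error between the model's predictions and the observed prices.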
Unsupervised learning
When the correct output is not given with the observed data.
ML tries to learn relations or patterns in the data components (also called attributes or features)
ML program can group the observed data into classes and assign the observations to these classes.
Learning is called clustering.
Finding the right number of classes and their centers or discriminant is learning.
Clustering is used for customer segmentation in CRM, and for learning motifs (sequences of amino acids
that occur in proteins) in bioinformatics
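A minimal k-means clusterer shows the idea: alternately assign each observation to its nearest center, then move each center to the mean of its assigned points. This is a bare-bones sketch (the naive "first k points" initialization is an assumption; practical code uses k-means++):

```python
import numpy as np

def kmeans(points, k, iters=20):
    """Plain k-means: assign points to the nearest center, then move each
    center to the mean of its assigned points, and repeat."""
    centers = points[:k].astype(float).copy()  # naive init: first k points
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # distance of every point to every center
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return centers, labels

# Two well-separated toy blobs; the clusterer should recover them.
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]], dtype=float)
centers, labels = kmeans(points, k=2)
```

Note that the number of classes k is an input here; choosing it well is itself part of the learning problem, as the notes say.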
Other types of unsupervised learners will be introduced later.
Reinforcement learning
When the correct output is a sequence of actions, and not a single action or output.
The model produces actions and receives rewards (or punishments). The goal is to find a sequence of
actions that maximizes rewards (and minimizes punishments).
Example: Game playing, where a single move by itself is not important. ML evaluates a sequence of
moves and determines how good the game-playing policy is.
Other applications: robot navigation,
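The flavor of reinforcement learning can be shown with tabular Q-learning on a tiny corridor world, where the reward arrives only at the end of a sequence of moves. Everything here (the environment, state count, and hyperparameters) is an illustrative assumption, not part of the lecture:

```python
import random

# Toy corridor: states 0..4, start at 0, reward 1.0 only on reaching state 4.
# Actions: 0 = move left, 1 = move right.
N_STATES, GOAL = 5, 4

def step(s, a):
    s2 = min(s + 1, GOAL) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def q_learn(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap episode length
            # epsilon-greedy: mostly exploit the current Q, sometimes explore
            a = rng.randrange(2) if rng.random() < eps \
                else max((1, 0), key=lambda act: Q[s][act])
            s2, r, done = step(s, a)
            # Update toward reward plus discounted value of the next state.
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
            if done:
                break
    return Q

Q = q_learn()
```

After training, the greedy policy (pick the action with the larger Q-value in each state) walks straight to the goal: no single move is rewarded by itself, but the whole sequence is.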
Concept learning
Learn to predict the value of some concept (e.g., playing some sport) given values of some attributes (e.g.,
temperature, humidity, wind speed, sky outlook) for some past observations or examples.
Values of a past example: outlook=sunny, temperature=hot, humidity=high, windy=false, play = NO
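A classic concept-learning procedure is Find-S: start from the most specific hypothesis and generalize it just enough to cover each positive example. A sketch, with toy examples modeled on the play-sport attributes above (the specific rows are invented; '?' means "any value"):

```python
# (attribute values, label); made-up rows in the style of the play example.
examples = [
    ({"outlook": "sunny", "temp": "hot",  "humidity": "high",   "windy": "false"}, "NO"),
    ({"outlook": "sunny", "temp": "warm", "humidity": "normal", "windy": "false"}, "YES"),
    ({"outlook": "sunny", "temp": "hot",  "humidity": "normal", "windy": "false"}, "YES"),
]

def find_s(examples):
    h = None
    for x, label in examples:
        if label != "YES":
            continue  # Find-S ignores negative examples
        if h is None:
            h = dict(x)  # first positive example: most specific hypothesis
        else:
            for attr, val in x.items():
                if h[attr] != val:
                    h[attr] = "?"  # generalize attributes that disagree
    return h

h = find_s(examples)
```

On this data the learned hypothesis keeps the attribute values all positive examples share and replaces the disagreeing ones with '?'.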
Other types of learning: instance-based learning, explanation-based learning, Bayesian
learning, case-based learning, statistical learning
Generalization
Machine learner uses a collection of observations (called training set) for learning
Good generalization requires low error when the learner is evaluated on a testing set
Avoid model overfitting, which happens when the training error is low but the generalization error is high.
For example, in regression you can find a polynomial of order n-1 that fits n points exactly.
Training error: 0
That does not mean the model will perform well with unseen data.
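The zero-training-error trap is easy to demonstrate numerically. In this sketch the five data points and their "noise" values are invented: a degree-4 polynomial interpolates all five points exactly, yet extrapolates far worse than a plain line.

```python
import numpy as np

# Five points from the line y = 2x + 1 with fixed, made-up noise added.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
noise = np.array([0.2, -0.3, 0.25, -0.2, 0.3])
y = 2 * x + 1 + noise

overfit = np.polyfit(x, y, deg=4)   # order n-1: interpolates, training error ~ 0
line = np.polyfit(x, y, deg=1)      # simpler model, small nonzero training error

train_err = np.abs(np.polyval(overfit, x) - y).max()

# At the unseen point x = 6 the true line gives 13; compare the two models.
pred_over = np.polyval(overfit, 6.0)
pred_line = np.polyval(line, 6.0)
```

The degree-4 fit has (numerically) zero training error but lands far from 13 at x = 6, while the line stays close: low training error says nothing by itself about generalization.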
Learning process
Learning is a process that consists of:
1. Data selection
Data may need to be cleaned and preprocessed.
2. Feature selection
Size (dimensionality) of the data can be large.
Document classification, for example, may involve too many distinct words.
Use features that are easier to extract and less sensitive to noise.
Divide the dataset into a training dataset and a testing dataset.
3. Model selection
A lot of guessing here. Select the model (or model set) and error function.
Select the simplest model first; if it does not fit well, try another class of models.
Avoid overfitting
4. Learning
Train the learner or model.
Find the parameter values by minimizing the error function.
5. Evaluation
Learner is evaluated on the testing dataset.
You may need to select another model, or switch to a different set of features.
6. Application
Apply the learning model.
For example, perform prediction with new, unseen data using learned model.
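The six steps above can be sketched end to end on synthetic data; the dataset sizes, noise level, and the choice of a linear model are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# 1-2. Data and feature selection: one feature x, target y = 3x - 2 plus noise.
x = rng.uniform(0, 10, size=40)
y = 3 * x - 2 + rng.normal(scale=0.5, size=x.size)

# Divide into a training dataset and a testing dataset.
idx = rng.permutation(x.size)
train, test = idx[:30], idx[30:]

# 3-4. Model selection and learning: fit y = w*x + w0 by least squares.
w, w0 = np.polyfit(x[train], y[train], deg=1)

# 5. Evaluation: mean squared error on the held-out testing set.
test_mse = np.mean((w * x[test] + w0 - y[test]) ** 2)

# 6. Application: predict for a new, unseen input.
y_new = w * 5.0 + w0
```

If the test error were large here, the process would loop back to step 3 (try another model) or step 2 (pick different features), exactly as the notes describe.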
In this course you will learn the following:
Different learning problems and their solutions
Choosing the right model for a learning problem
Finding good parameter values for the models
Selecting good features or parameters as input to a model
Evaluating a machine learner