How to understand and implement regression analysis
artificiallyintelligentclaire.com/regression-analysis
They say a picture is worth a thousand words. So to summarize regression analysis in machine
learning, I have created an infographic.
The regression algorithms infographic is designed as a quick reminder of the basics of
each algorithm.
Before I share it with you, however, I will give an overview of regression analysis.
When is regression analysis used in machine learning?
Regression algorithms are used in supervised machine learning on continuous data.
When we talk about supervised learning in ML, what we mean is that we have a set of
training data for the algorithm to learn from. Each example in this training data contains
the inputs as well as the known output value that was actually observed.
For example, the number of rooms a house has (input) and the price of the house (output).
This training data is used to teach the machine how the number of rooms and price are
related, allowing it to predict the output (the price of a house) from the input (the
number of rooms).
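As a minimal sketch of the rooms-and-price example, assuming Scikit-Learn is installed: the numbers below are made up for illustration and follow an exactly linear pattern, so the model should predict about 400,000 for a seven-room house.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: number of rooms (input) and price (output).
rooms = np.array([[2], [3], [4], [5], [6]])
prices = np.array([150_000, 200_000, 250_000, 300_000, 350_000])

# Learn how the number of rooms and the price are related.
model = LinearRegression()
model.fit(rooms, prices)

# Predict the price of a house the model has not seen: 7 rooms.
predicted_price = model.predict(np.array([[7]]))[0]
print(f"Predicted price for 7 rooms: {predicted_price:.0f}")
```

Real housing data is noisy and depends on far more than room count, so treat this only as an illustration of the input/output idea.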
Continuous data is where the predicted output could, in theory, hold any numeric value
within a range. For example, house price data is classed as continuous.
Discrete data, by comparison, can only take one of a fixed set of values. A common example
is the answer to a yes or no question, e.g. is this item a chair? Yes/No. Predicting that
kind of output is a classification problem, not a regression problem.
Regression algorithms allow you to estimate:
The impact of different input variables on the output
The output value for a new set of inputs
Now let’s take a deeper look at regression algorithms.
What are the different types of regression analysis?
There are many different types of analysis you can run using regression algorithms. These
are shared in the infographic below, but I have also summarized them here.
Linear Regression: Models the relationship between two variables with a linear
relationship, i.e. as one changes, the other changes at a constant rate. It fits a line
that minimizes the sum of squared vertical distances between each data point and the line,
a method known as ordinary least squares.
Multivariable Regression: Similar to linear regression, except that you evaluate multiple
input variables. The relationship does not always have to be linear.
Polynomial Regression: Models the relationship between two variables with a
non-linear relationship, e.g. a cubic relationship.
SVR: Support Vector Regression fits a function that tries to keep predictions within a
set margin of the actual values, and only penalizes errors that fall outside that margin.
Decision Tree: Decision tree regression splits the data into discrete sections that are
arrived at by following a set of binary (yes/no) decisions on the input variables. A
prediction is then made by taking the average output value of all training data in the
section where the new data point lands.
Random Forest: Random forest regression uses the same process as decision tree
regression above but creates multiple decision trees and then makes a prediction for
your data point based on the average prediction of all trees created for the forest.
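To make the decision tree and random forest descriptions above concrete, here is a small sketch using Scikit-Learn on made-up data. It checks two things: a tree with a single split predicts the average of the section a new point lands in, and a forest's prediction is the average of its individual trees' predictions. The data and the choice of 10 trees are assumptions for illustration only.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

# Hypothetical toy data: one input variable, two clearly separated groups.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y = np.array([10.0, 12.0, 14.0, 50.0, 52.0, 54.0])
new_point = np.array([[2.5]])

# A depth-1 tree makes one binary split, giving two sections (leaves).
tree = DecisionTreeRegressor(max_depth=1, random_state=0)
tree.fit(X, y)

# The new point lands in the low group, so the prediction is that group's mean.
tree_prediction = tree.predict(new_point)[0]
low_group_mean = y[:3].mean()
print(tree_prediction, low_group_mean)

# A forest trains several trees and averages their predictions.
forest = RandomForestRegressor(n_estimators=10, random_state=0)
forest.fit(X, y)
forest_prediction = forest.predict(new_point)[0]
mean_of_trees = np.mean([t.predict(new_point)[0] for t in forest.estimators_])
print(forest_prediction, mean_of_trees)
```

Each tree in the forest is trained on a random resample of the data, which is why the individual trees can disagree and why averaging them tends to give a steadier prediction.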
For more details, I recommend Wikipedia as a great resource (link)
Now for the fun stuff – implementing regression analysis!
How do you implement regression algorithms using Python?
In this section, I have provided links to the documentation in Scikit-Learn for implementing
regression.
Before you do any type of data analysis using regression algorithms, however, you need to
clean your data.
This process is called data pre-processing and is essential for ensuring you get a good
output from your algorithm.
Some steps to follow are:
Check for outliers in the data that could skew the results
Replace missing values with the average value of that variable (this is one option,
generally seen as better than removing the data point entirely)
Feature scaling: If you have input variables on very different scales, you may need to
scale them so that variables with larger values do not dominate the model
The YouTube tutorial videos #37, 38 and 39 cover some techniques to do this here (link)
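As a rough sketch of the missing-value and feature-scaling steps, the example below uses Scikit-Learn's SimpleImputer and StandardScaler. The two columns (rooms and floor area) and their values are hypothetical, and the outlier check is not shown.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Hypothetical inputs: [number of rooms, floor area in square feet].
X = np.array([
    [2.0, 800.0],
    [3.0, np.nan],   # missing floor area
    [4.0, 1600.0],
    [5.0, 2000.0],
])

# Replace each missing value with the mean of its column.
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)

# Rescale every column to mean 0 and standard deviation 1,
# so that floor area does not dominate rooms purely because of its units.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_filled)

print(X_filled)
print(X_scaled)
```

When you have separate training and test data, fit the imputer and scaler on the training data only and then apply them to the test data with transform.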
Implementing regression algorithms in Python using the Scikit-Learn module:
Want to learn more about machine learning in Python? Check out my previous post on
choosing a machine learning in Python course.
Then, depending on the type of regression algorithm you need, below are links to the
documentation:
Linear
Multivariable – uses the same module as linear regression
Polynomial
SVR
Decision Tree
Random Forest
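For orientation, here is a sketch of how each algorithm in the list above can be created and fitted with Scikit-Learn. The data is randomly generated, and the settings (polynomial degree 3, the SVR parameters, 50 trees) are arbitrary choices for illustration rather than recommendations.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

# Hypothetical data: two input variables and a noisy continuous output.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 2))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0, 0.5, size=100)

models = {
    # LinearRegression handles one or many input variables.
    "Linear / multivariable": LinearRegression(),
    # Polynomial regression: expand the inputs, then fit a linear model.
    "Polynomial (degree 3)": make_pipeline(
        PolynomialFeatures(degree=3), LinearRegression()
    ),
    "SVR": SVR(kernel="rbf", C=100.0, epsilon=0.1),
    "Decision tree": DecisionTreeRegressor(random_state=0),
    "Random forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X, y)
    # R^2 on the training data; use held-out data for a fair comparison.
    scores[name] = model.score(X, y)
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

All of these share the same fit, predict and score methods, so swapping one algorithm for another usually means changing a single line.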
About the Author:
My name is Claire and I help inquisitive millennials who love to learn about tech and artificial
intelligence by blogging about learning to code and innovations in AI. I also talk about my
journey through blogging to find greater well-being in this highly stressful environment.
Burnout in tech is no joke but through my blog, I want to help others find strategies that help
them with mental well-being.
You can learn more about me on my blog – Artificially Intelligent Claire
Regression Algorithms Infographic