Context
1. Housing Agent collected resale prices on HDB apartments in Singapore.
Objective
2. To predict resale prices in order to advise his potential clients.
Strategies
3. Explore & Clean data for analysis.
4. Perform Linear Regression, in Orange, to model resale prices from the collected data.
5. Tune the model to improve its performance.
6. Visualise the findings, share conclusions, and give insight-driven recommendations.
Author: Anthony Mok
Date: 18 Nov 2023
Email: xxiaohao@yahoo.com
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
1. Predicting HDB Resale Prices: Linear Regression With Orange
Author: Anthony Mok
Date: 18 Nov 2023
Email: xxiaohao@yahoo.com
2. What Do Users Say About Orange
Orange is an open-source data visualisation and machine learning toolkit that has been widely praised:
"Orange is an excellent data mining tool that is easy to use and has a wide range of features."
"I love Orange's visual programming interface. It makes it easy to build complex data mining workflows."
"Orange is a powerful tool that can be used to solve a wide variety of problems."
3. Project’s Context, Objective & Strategies
Context
Housing Agent collected resale prices on HDB apartments in Singapore
Objective
To predict resale prices in order to advise his potential clients
Strategies
Explore & Clean data for analysis
Perform Linear Regression, in Orange, to model resale prices from the collected data
Tune the model to improve its performance
Visualise the findings, share conclusions, and give insight-driven recommendations
6. Loading File & Exploring Data
Loading File
The hdb_resales.csv file was imported into the workflow. The ‘Role’ for ‘resales_price’ was set as ‘target’, with the rest set as ‘feature’
Exploring Data
No missing data found
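The loading, role assignment, and missing-data check above were done with Orange widgets. As a rough equivalent, the following is a minimal sketch in plain Python. Only the 'resales_price' column name comes from the slides; the other column names and all sample rows are hypothetical, and one price is deliberately left blank so the check has something to report (the actual dataset had no missing values).

```python
import csv
import io

# Hypothetical sample standing in for hdb_resales.csv.
# Only the 'resales_price' column name is taken from the slides.
SAMPLE_CSV = """town,flat_type,floor_area_sqm,resales_price
ANG MO KIO,3 ROOM,67,310000
BISHAN,4 ROOM,92,560000
TAMPINES,5 ROOM,121,
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per apartment."""
    return list(csv.DictReader(io.StringIO(text)))


def count_missing(rows):
    """Count empty cells per column."""
    counts = {name: 0 for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            if value is None or value.strip() == "":
                counts[name] += 1
    return counts


def split_roles(rows, target="resales_price"):
    """Separate the target column from the feature columns."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    targets = [row[target] for row in rows]
    return features, targets


rows = load_rows(SAMPLE_CSV)
missing = count_missing(rows)
features, targets = split_roles(rows)
print(missing)
```

With a real file, the same functions could be fed the text returned by reading hdb_resales.csv from disk.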
7. Looking for Relationships & Patterns*
Floor Space & Resale Prices
Resale prices increase as floor area increases
Region & Resale Prices
More 4-room and 5-room apartments in the Central Region command resale prices above 500K than in other regions
* More comprehensive findings and conclusions were provided in the project report, which are
not released at the request of the Housing Agent
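The two patterns on this slide can be checked numerically as well as visually. The sketch below, using made-up numbers rather than the Housing Agent's data, computes a Pearson correlation between floor area and price, and the share of apartments priced above 500K in each region. The function names and region labels are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def share_above(records, threshold):
    """Per region, the fraction of records priced above the threshold."""
    totals, above = {}, {}
    for region, price in records:
        totals[region] = totals.get(region, 0) + 1
        if price > threshold:
            above[region] = above.get(region, 0) + 1
    return {region: above.get(region, 0) / totals[region] for region in totals}


# Hypothetical values for illustration only.
floor_area = [67, 92, 104, 121]
price = [310000, 450000, 520000, 640000]
records = [
    ("Central", 620000), ("Central", 540000), ("Central", 480000),
    ("North", 430000), ("North", 510000), ("North", 390000), ("North", 350000),
]

r = pearson(floor_area, price)
shares = share_above(records, 500_000)
print(round(r, 3), shares)
```

A correlation close to 1 would correspond to the upward trend seen in the scatter plot; the per-region shares correspond to the regional comparison.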
8. Splitting Data & Doing Linear Regression
Splitting Data
Dataset was split into 70% for training and 30% for testing the Linear Regression Model
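Outside Orange, a 70/30 split of this kind can be sketched as a seeded shuffle followed by a cut. The function name, the seed, and the stand-in rows are assumptions for illustration; the slides do not say how Orange sampled the rows.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of the rows and cut it into training and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in rows; in practice these would be the apartment records.
rows = list(range(100))
train, test = train_test_split(rows)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, which makes the training and testing scores comparable across runs.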
Conduct Linear Regression
The Linear Regression Model was fitted on the training data and then applied to the testing data
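To show what fitting involves, here is a deliberately simplified sketch: ordinary least squares with a single feature (floor area), solved in closed form. The model in the slides uses several features, including town, so this is an illustration of the technique rather than the project's model, and the numbers are invented.

```python
def fit_simple_ols(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(xs, slope, intercept):
    """Apply the fitted line to new feature values."""
    return [slope * x + intercept for x in xs]


# Hypothetical training data: floor area (sqm) and resale price.
train_area = [60, 80, 100, 120]
train_price = [300000, 400000, 500000, 600000]

slope, intercept = fit_simple_ols(train_area, train_price)
predicted = predict([90], slope, intercept)
print(slope, intercept, predicted)
```

The slope plays the same role as a coefficient in Orange's Linear Regression widget: the change in predicted price per unit change in the feature.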
Coefficients*
Mix of positive and negative coefficients for the towns these apartments belong to: some towns raise the predicted resale price and others lower it
* More comprehensive findings and conclusions were provided in the project report, which are not released at the request of the Housing Agent
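A town can only receive a coefficient once it is turned into numeric indicator columns. The sketch below shows one common encoding, with the first town dropped as a baseline; whether Orange used this exact scheme is an assumption, and the town list is illustrative.

```python
def one_hot(values):
    """Encode categories as 0/1 indicator columns, dropping a baseline."""
    categories = sorted(set(values))
    baseline, kept = categories[0], categories[1:]
    encoded = [[1 if value == category else 0 for category in kept]
               for value in values]
    return baseline, kept, encoded


# Hypothetical town values for four apartments.
towns = ["BISHAN", "ANG MO KIO", "TAMPINES", "BISHAN"]
baseline, columns, encoded = one_hot(towns)
print(baseline, columns, encoded)
```

Under this encoding, a positive coefficient on a town column means the model prices that town above the baseline town, all else equal, and a negative coefficient means below it.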
9. Evaluating Performance of Model
Evaluation (Training Data)
Performance of the model on the 70% training data:
RMSE = 53,981
R2 = 82.3%
Evaluation (Testing Data)
Performance of the model on the 30% testing data:
RMSE = 61,992
R2 = 78.3%
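For readers unfamiliar with the two scores, the following sketch computes RMSE and R2 from actual and predicted prices using their standard definitions. The sample values are invented and do not reproduce the figures above.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: typical size of a prediction error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Share of the variance in the actual values explained by the model."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted resale prices.
actual = [300000, 400000, 500000, 600000]
predicted = [310000, 390000, 520000, 580000]

error = rmse(actual, predicted)
score = r_squared(actual, predicted)
print(round(error, 1), round(score, 3))
```

RMSE is expressed in the same unit as the resale price, while R2 is a unitless proportion, which is why the slides report it as a percentage.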
Findings, Conclusions & Recommendations
With an R-squared (R2) value of 82.3% on the training data and 78.3% on the testing data, the model explains roughly four-fifths of the variation in resale prices, which suggests it is performing well.
At 53,981 for the training data and 61,992 for the testing data, the Root Mean Squared Error (RMSE) is also relatively low. This suggests that the model's predictions are fairly close to the actual resale prices.
Overall, these scores suggest that the model is fit for use in prediction without the need for regularisation: R2 drops by only 4.0 percentage points from training to testing data, so the model does not appear to be overfitted.
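The overfitting judgement rests on how far the testing scores fall behind the training scores. A minimal sketch of that arithmetic, using the four figures reported on this slide, is shown below. The function name is an assumption, and no pass/fail threshold is applied because the slides do not state one.

```python
def generalisation_gap(train_rmse, test_rmse, train_r2, test_r2):
    """Return the drop in R2 and the relative rise in RMSE on testing data."""
    r2_drop = train_r2 - test_r2
    rmse_rise = (test_rmse - train_rmse) / train_rmse
    return r2_drop, rmse_rise


# Scores as reported in the evaluation above.
r2_drop, rmse_rise = generalisation_gap(
    train_rmse=53_981, test_rmse=61_992, train_r2=0.823, test_r2=0.783
)
print(f"R2 drop: {r2_drop:.3f}, RMSE rise: {rmse_rise:.1%}")
```

A small R2 drop alongside a moderate RMSE rise is the pattern the conclusion relies on; a much larger gap would point towards overfitting and make regularisation worth trying.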
10. Making Predictions With Model*
Predicting HDB Apartment Resale Prices
Snapshot of predicted HDB apartment resale prices
Predicting Resale Prices
The number of rooms in an apartment is a relatively good predictor of its resale price. The Housing Agent can use this model to predict the resale prices of 3-room, 4-room, and 5-room apartments with a reasonable degree of accuracy (R2 of 78.3% on the testing data)
* More comprehensive findings and conclusions were provided in the project report, which are not released at the request of the Housing Agent