Slide 1 of 52
Economic and Financial Forecasting
Using Machine Learning Methodologies
Periklis Gogas
Theophilos Papadimitriou
Slide 2 of 52
The framework
• Research Grant “Thales” – awarded to T.
Papadimitriou
• Started
• Concludes November 2015
• Dissemination of research results
Slide 3 of 52
Presentation Outline
• Intuition for using Machine Learning in Forecasting
• Forecasting applications:
• Exchange rate forecasting
• Forecasting House Prices in the U.S.
• Forecasting Bank Failures and stress testing in the U.S.
• Forecasting Recession
Slide 4 of 52
Why Machine Learning?
• Renewed interest in machine learning
• Higher volumes and varieties of available data
• Affordable data storage.
• More importantly: computer processing is cheaper
and more powerful
Slide 5 of 52
Application 1: Forecasting Exchange Rates
• Driven by microeconomic factors: markets, demand & supply. High frequency (daily)
• Driven by macroeconomics, e.g. monetary exchange rate models. Lower frequency (monthly)
Journal of Forecasting, 2015, vol. 34 (7), pp. 560-573
• 2 exchange rate frequencies
• Theory: different data generating processes
• Is this confirmed?
Slide 6 of 52
Methodologies
Econometrics
• ARCH
• GARCH
• EGARCH
• AR
• ARMA
• ARIMA
• ARFIMA
• Random Walk
Machine Learning
• Artificial Neural Networks
• Support Vector Regression
Slide 7 of 52
Overview: Hybrid Method
Slide 8 of 52
Decomposition: Ensemble Empirical Mode
Decomposition
Slide 9 of 52
Variable Selection: Multivariate Adaptive
Regression Splines (MARS)
• Breaks the dataset into subsamples
• Identifies the observations, called knots, where the breaks occur
• Assigns weights to variables
• Selects the most informative variables
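The role of a knot can be illustrated with the hinge functions that MARS models are built from. This is only a minimal sketch: the knot position and the coefficients are hypothetical, and the knot search and variable selection steps of MARS are not shown.

```python
def hinge(x, knot, direction=+1):
    """MARS basis function: max(0, x - knot) for direction=+1,
    max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))


def piecewise_linear(x, knot, intercept, left_slope, right_slope):
    # left_slope weights the (knot - x)+ part, right_slope the (x - knot)+ part,
    # so the fitted line is allowed to change slope at the knot.
    return (intercept
            + left_slope * hinge(x, knot, -1)
            + right_slope * hinge(x, knot, +1))


# Hypothetical knot at x = 2.0 with different slopes on each side.
values = [piecewise_linear(x, knot=2.0, intercept=1.0,
                           left_slope=0.5, right_slope=2.0)
          for x in (0.0, 2.0, 3.0)]
print(values)
```

Each side of the knot gets its own slope, which is how the subsamples on either side can be fitted separately.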
Slide 10 of 52
Forecasting: Support Vector Regression
[Figure: error tolerance band; labels +ε, -ε, 0, ζ]
• Fits an error tolerance band
• More flexible than fitting a simple line
• The band is defined by a subset of the observations, called support vectors
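The tolerance band corresponds to the ε-insensitive loss commonly used in Support Vector Regression: errors inside the band cost nothing, errors outside it are penalized by how far they fall beyond ε. A minimal sketch, with a hypothetical ε and hypothetical data points:

```python
def epsilon_insensitive_loss(actual, predicted, epsilon):
    """Zero inside the +/- epsilon band, linear in the excess outside it."""
    return max(0.0, abs(actual - predicted) - epsilon)


# Hypothetical (actual, predicted) pairs and a hypothetical band width.
pairs = [(1.00, 1.05), (1.00, 1.30), (1.00, 0.60)]
losses = [epsilon_insensitive_loss(a, p, epsilon=0.1) for a, p in pairs]
print(losses)
```

Only the second and third points fall outside the band and contribute to the loss; the first point is ignored by the fit.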
Slide 11 of 52
The Kernel Projection
• We fit a linear model
• Real phenomena are usually non-linear
• Project the data to higher dimensions through a kernel K(x1,x2)
• Find a space where a linear error tolerance band can be defined
• Re-project to the original space and obtain a non-linear error tolerance band
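A kernel K(x1,x2) is an ordinary function of two observations. The sketch below writes out the four kernel families used in this work (linear, RBF, polynomial, sigmoid) in their common parameterization. The parameter names gamma, r and d are assumptions taken from the usual SVM convention, not from the slides.

```python
import math


def dot(x1, x2):
    return sum(a * b for a, b in zip(x1, x2))


def linear_kernel(x1, x2):
    return dot(x1, x2)


def rbf_kernel(x1, x2, gamma):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x1, x2, gamma, r, d):
    return (gamma * dot(x1, x2) + r) ** d


def sigmoid_kernel(x1, x2, gamma, r):
    return math.tanh(gamma * dot(x1, x2) + r)


# Hypothetical observations.
x1, x2 = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x1, x2))
print(rbf_kernel(x1, x2, gamma=0.5))
print(polynomial_kernel(x1, x2, gamma=1.0, r=1.0, d=2))
print(sigmoid_kernel(x1, x2, gamma=0.1, r=0.0))
```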
Slide 12 of 52
Kernels Used
Linear: $K(x_1, x_2) = x_1^T x_2$
RBF: $K(x_1, x_2) = e^{-\gamma \|x_1 - x_2\|^2}$
Polynomial: $K(x_1, x_2) = (\gamma x_1^T x_2 + r)^d$
Sigmoid: $K(x_1, x_2) = \tanh(\gamma x_1^T x_2 + r)$
Building the Model: Split the dataset into two parts
• Separate the data
• Train data: used to obtain the optimum model
• Test data: never seen by the optimum model, used for out-of-sample forecasting
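A split of this kind can be sketched as follows. The sketch assumes the observations are ordered in time and that the test part is taken from the end of the sample; the slides do not state how the split was made, so this is an illustrative choice.

```python
def train_test_split_ordered(observations, n_test):
    """Keep the last n_test observations aside as the test part."""
    if not 0 < n_test < len(observations):
        raise ValueError("n_test must be between 1 and len(observations) - 1")
    cut = len(observations) - n_test
    return observations[:cut], observations[cut:]


# Hypothetical series of 10 observations, the last 3 held out for testing.
series = list(range(10))
train, test = train_test_split_ordered(series, n_test=3)
print(train, test)
```

The test part is never touched while the model is being selected, which is what makes the later comparison an out-of-sample one.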
Overfitting and Cross-Validation
• The problem of overfitting:
• High accuracy in a specific sample
• Low generalization ability
• Solution: use k-part Cross-Validation
• Example of a 3-part CV
[Figure: example of a 3-part cross-validation. The initial dataset is split three ways into training and testing parts, one model is trained per split, and the models are then evaluated.]
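The k-part scheme can be sketched by generating the index sets for each fold. This is a minimal illustration with contiguous folds; the function name and the fold layout are assumptions, not taken from the slides.

```python
def k_fold_indices(n_observations, k):
    """Return k (train_indices, test_indices) pairs with contiguous test folds."""
    if not 2 <= k <= n_observations:
        raise ValueError("k must be between 2 and n_observations")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_observations // k + (1 if i < n_observations % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_observations) if i < start or i >= stop]
        folds.append((train_idx, test_idx))
        start = stop
    return folds


# The 3-part example: 9 hypothetical observations, 3 folds.
folds = k_fold_indices(9, 3)
for train_idx, test_idx in folds:
    print(train_idx, test_idx)
```

Every observation is used for testing exactly once, so the averaged score is less tied to one particular sample than a single split.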
Slide 16 of 52
Forecasting Exchange Rates
The data used:
• 5 rates of various trading volumes:
• 2 high volume rates: USD/EUR, USD/JPY
• 1 medium volume rate: NOK/AUD
• 2 low volume rates: ZAR/PHP and NZD/BRL
Slide 17 of 52
Input Variables (192)
• Commodities: Crude Oil, Cotton, Lumber, Cocoa, Coffee, Orange Juice, Sugar, Corn, Wheat, Oats, Rough Rice, Soybean Meal, Soybean Oil, Soybeans, Feeder Cattle, Lean Hogs, Live Cattle, Pork Bellies, Iron Ore
• Metals: Gold, Copper, Palladium, Platinum, Silver, Aluminum, Zinc, Nickel, Lead, Tin
• Stock Indices: Dow Jones, Nasdaq 100, S&P 500, DAX, CAC 40, FTSE 100, Nikkei 225
• Interest rates: T-bill 6 months, T-bill 10 years, Spread MLP-EURIBOR 3M, Spread MLR-Eonia, Spread FF-CP, Spread FF-EFF, EONIA, EURIBOR 1 Week, ECB Interest rate, EURIBOR 1 Month, FED rate
• Technical Analysis variables: Five Day Index, Moving Average 3 day, Moving Average 5 day, Moving Average 10 day, Moving Average 30 day
• Macroeconomic variables (all countries): CPI, Productivity index, GDP, Trade Balance, Unemployment, Central Bank Discount rate, Long Term Interest Rates, Short Term Interest Rate, Aggregate money M1, M2, Public Debt, Deficit/Surplus of Government Budget
• Exchange Rates: JPY/EUR, JPY/USD, USD/GBP, BRL/NZD, NOK/AUD, PHP/ZAR, EUR/GBP, EUR/USD
• USD Trade Weighted Indices: Major partners, Broad Index, Other Partners
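The moving-average inputs among the technical analysis variables are simple to derive from a price series. A minimal sketch, assuming a plain (unweighted) trailing moving average and a hypothetical price series; the slides do not specify the exact variant used.

```python
def moving_average(prices, window):
    """Trailing simple moving average; the first value uses prices[0:window]."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [sum(prices[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(prices))]


# Hypothetical daily prices.
prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ma3 = moving_average(prices, 3)
ma5 = moving_average(prices, 5)
print(ma3, ma5)
```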
Slide 18 of 52
Forecasting Exchange Rates
EEMD smoothed component vs USD/EUR series
Slide 19 of 52
Out-of-sample forecasting results, daily frequency
Autoregressive models, no fundamentals
[Bar chart: Mean Absolute Percentage Error (axis 0% to 16%) for EUR/USD, USD/JPY and AUD/NOK. Models compared: RW, AR-SVR, EEMD-AR-SVR, EEMD-MARS-SVR, EEMD-AR-ANN, EEMD-MARS-ANN, ARIMA, GARCH]
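The charts report the Mean Absolute Percentage Error (MAPE), with the Random Walk (RW) as the benchmark. A minimal sketch of both, using a hypothetical series; the RW forecast here is the usual "no change" forecast, where the prediction for each period is the previous observed value.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return 100.0 * sum(abs((a - f) / a)
                       for a, f in zip(actual, forecast)) / len(actual)


def random_walk_forecast(series):
    """Forecast for period t is the observed value at t-1."""
    return series[:-1]


# Hypothetical exchange rate series.
rates = [1.00, 1.10, 1.21, 1.10]
actual = rates[1:]
rw_mape = mape(actual, random_walk_forecast(rates))
print(round(rw_mape, 2))
```

A competing model beats the benchmark when its MAPE on the same test observations is lower than `rw_mape`.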
Slide 20 of 52
Out-of-sample forecasting results, daily frequency
Autoregressive models, no fundamentals
[Bar chart: Mean Absolute Percentage Error (axis 0% to 6%) for NZD/BRL and ZAR/PHP. Models compared: RW, AR-SVR, EEMD-AR-SVR, EEMD-MARS-SVR, EEMD-AR-ANN, EEMD-MARS-ANN, ARIMA, GARCH]
Slide 21 of 52
Out-of-sample forecasting results, monthly frequency
Structural Models: Fundamentals Included
[Bar chart: Mean Absolute Percentage Error (axis 0.0% to 1.0%) for EUR/USD, USD/JPY and AUD/NOK. Models compared: RW, AR-SVR, EEMD-AR-SVR, EEMD-MARS-SVR, EEMD-AR-ANN, EEMD-MARS-ANN, ARIMA, GARCH]
Slide 22 of 52
Out-of-sample forecasting results, monthly frequency
Structural Models: Fundamentals Included
[Bar chart: Mean Absolute Percentage Error (axis 0.0% to 0.9%) for NZD/BRL and ZAR/PHP. Models compared: RW, AR-SVR, EEMD-AR-SVR, EEMD-MARS-SVR, EEMD-AR-ANN, EEMD-MARS-ANN, ARIMA, GARCH]
Slide 23 of 52
Conclusions
• The best model outperforms the RW model
• ML outperforms all econometric methodologies
• Rejection of even the weak form of efficiency
• The model captures the different data generating processes that drive exchange rates in the short and the long run
Slide 24 of 52
Application 2: Forecasting Exchange Rates Directionally
Algorithmic Finance, vol. 4 (1-2), pp. 69-79.
• Forecast direction: UP or DOWN
• Frequency: Daily
• Rates: USD/EUR, USD/JPY, USD/GBP and USD/AUD
• Data span: January 2, 2013 to December 26, 2013
• Training: 200
• Test: 51
Slide 25 of 52
Application 2: Forecasting Exchange Rates Directionally
• Order Flow Analysis
• Problem: data availability
• Alternative: www.Stocktwits.com
• Investor sentiment
Slide 26 of 52
Application 2: Forecasting Exchange
Rates Directionally
Slide 27 of 52
Input Variables Sets
• Past values of the exchange rate
• The volume of “Bearish” and “Bullish” posts per day
• The volume of “Bearish”, “Bullish” and total posts per day
• Past values of the exchange rate and the volumes of “Bullish” and “Bearish” posts per day
• Past values of the exchange rate and the volumes of “Bullish”, “Bearish” and total posts per day
Slide 28 of 52
Application 2: Methodologies
Machine Learning
• Artificial Neural Networks
• Support Vector Machines
• k-Nearest Neighbors (kNN)
• Boosted Decision Trees
Econometrics
• Logistic Regression
• Naïve Bayes Classifier
• Random Walk
Slide 29 of 52
Classification: Support Vector Machines
[Figure: projection Φ to n+i dimensions and inverse projection Φ-1 back to the original space]
Slide 31 of 52
Support Vector Machines
Slide 32 of 52
Forecasting Exchange Rates
(Short term, microstructural approach, classification)
[Bar chart: Directional Accuracy (axis 0% to 80%) for USD/EUR, USD/JPY, USD/GBP and USD/AUD. Models compared: RW, SVM, Naïve Bayes, kNN, Adaboost, Logitboost, Logistic Regression, ANN]
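Directional accuracy is the share of days on which the forecasted direction (UP or DOWN) matches the realized one. A minimal sketch with hypothetical rates and forecasts; treating an unchanged rate as DOWN is an assumption made here only to keep the labels binary.

```python
def directions(series):
    """Label each day-to-day move as UP or DOWN (no change counts as DOWN)."""
    return ["UP" if nxt > prev else "DOWN"
            for prev, nxt in zip(series, series[1:])]


def directional_accuracy(actual_dirs, predicted_dirs):
    """Percentage of periods where the predicted direction was correct."""
    if len(actual_dirs) != len(predicted_dirs) or not actual_dirs:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(a == p for a, p in zip(actual_dirs, predicted_dirs))
    return 100.0 * hits / len(actual_dirs)


# Hypothetical daily rates and hypothetical model output.
rates = [1.30, 1.32, 1.31, 1.31, 1.35]
actual = directions(rates)
predicted = ["UP", "UP", "DOWN", "UP"]
accuracy = directional_accuracy(actual, predicted)
print(actual, accuracy)
```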
Slide 33 of 52
Conclusions
• The machine learning methodologies outperform the RW model
• Machine learning techniques forecast the out-of-sample direction more accurately than the econometric methodologies
• Market hype, expressed through the volume (total number) of posts, improves the overall forecasting ability
• An alternative informational path to order flow analysis
Slide 34 of 52
Application 3: Forecasting the Case-Shiller house price index
Economic Modelling, 2015, vol. 45, pp. 259-267.
Forecasts
• 1-10 years ahead
• 1890-2012
• 80% - 20%
Input Variables
• Real GDP per capita
• Long and short term interest rates
• Population
• Real asset value
• Real construction cost
• Unemployment / Inflation
• Real oil prices
• Fiscal Policy Indicator
Methodologies
• EEMD – Elastic Net – SVR
• Bayesian AR/VAR
Slide 35 of 52
Optimum Model
[Bar chart: Mean Absolute Percentage Error and Directional Accuracy (axis 0% to 100%). Models compared: RW, RW with drift, BAR, BVAR, AR-SVR, EN-SVR, EEMD-AR-SVR, EEMD-EN-SVR]
Slide 36 of 52
Comparison in Alternative Forecasting Horizons
EEMD-AR-SVR vs Random Walk
[Chart: Mean Absolute Percentage Error (axis 0% to 25%) by forecasting horizon, 1 to 10 years, for RW and EEMD-AR-SVR]
Slide 37 of 52
The EEMD-AR-SVR and the Collapse of the Housing Market – Actual vs Forecasted
[Chart: actual vs forecasted values, axis 95 to 195]
The ML model forecasts the sudden drop in house prices in 2006 that sparked the 2007 financial crisis.
Slide 38 of 52
Conclusions
• EEMD-SVR forecasts the evolution of house prices out-of-sample more accurately than all models used in the literature
• The proposed model forecasts the actual 2006 sudden drop in house prices almost 2 years ahead
Slide 39 of 52
Application 4: Forecasting bank failures
and stress testing
• 1443 U.S. Banks
• 962 solvent
• 481 failed
Data from FDIC (Federal Deposit Insurance Corporation)
Period 2003-2013
The dependent variable
• Financial position of a bank (solvent or insolvent)
The independent variables
• 144 financial variables and ratios for each bank
Slide 40 of 52
Variable selection:
Local-learning for high-dimensional data (Sun et al., 2010)
Selected variables:
• Tier 1 (core) risk-based capital/total assets year t-1
• Provision for loan and lease losses/total interest
income year t-1
• Loan loss allowance/total assets year t-1
• Total interest expense/total interest income year t-1
• Equity capital to assets year t-1
Slide 41 of 52
Forecasting bank failures and stress testing
• 3 forecasting windows
[Bar chart: out-of-sample accuracy (axis 50% to 100%) by forecasting window: t-1 96.67%, t-2 89.90%, t-3 78.13%]
Slide 42 of 52
Forecasting accuracy
Optimum model: Year t-1
                    Real Solvent (228)   Real Failed (115)
Predicted Solvent   223                  3
Predicted Failed    5                    112
Accuracy            97.81%               97.39%
Solvent misclassified as failed: 5
• 4 out of 5 received enforcement action from FDIC
• 1 received financial help from the U.S. Treasury as part of the Capital Purchase Program under the Troubled Asset Relief Program
Failed misclassified as solvent: 3
• 1 was discovered to have unreported losses and was closed
• 2 were small banks, with total assets of $44.5 and $26.3 million respectively
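The two accuracy figures follow directly from the counts on this slide: each is the share of correctly classified banks within its real class. A short check, using only the numbers reported above (variable names are illustrative):

```python
# Counts reported on the slide for the optimum model (year t-1).
solvent_correct, solvent_as_failed = 223, 5
failed_correct, failed_as_solvent = 112, 3

real_solvent = solvent_correct + solvent_as_failed   # 228 banks
real_failed = failed_correct + failed_as_solvent     # 115 banks

solvent_accuracy = 100.0 * solvent_correct / real_solvent
failed_accuracy = 100.0 * failed_correct / real_failed

print(round(solvent_accuracy, 2), round(failed_accuracy, 2))
```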
Slide 43 of 52
Solvency elasticity – Stress testing
[Figure: Bank 1, Bank 2 and Bank 3 plotted in feature space (axis: Variable 1), which is split into an Insolvent Bank Space and a Solvent Bank Space]
• Banks are mapped onto the feature space
• A bank's distance from the separating hyperplane indicates the sensitivity of its state
• This property may be used as an auxiliary stress testing methodology
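For a linear separating hyperplane w·x + b = 0, the distance idea can be sketched as below. The weights, intercept and bank coordinates are hypothetical, and taking the positive side as the solvent space is an assumption; with a non-linear kernel the distance would be measured in the projected space instead.

```python
import math


def signed_distance(x, w, b):
    """Signed distance of point x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


# Hypothetical hyperplane and banks described by two variables.
w, b = [3.0, 4.0], -5.0
banks = {"Bank 1": [3.0, 4.0], "Bank 2": [1.0, 1.0], "Bank 3": [0.0, 0.0]}

for name, features in banks.items():
    d = signed_distance(features, w, b)
    side = "solvent" if d > 0 else "insolvent"
    print(name, side, round(d, 2))
```

In this toy setup Bank 2 sits close to the boundary, so a small deterioration in its variables would move it to the other side, while Bank 1 has a much larger margin.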
Slide 44 of 52
Application 5: Yield Curve and Recession Forecasting
• The behavior of the yield curve is associated with business cycles.
• It is an indicator of future economic activity.
Application 5: The Yield Curve and EU Recession Forecasting
International Finance, vol. 18 (2), pp. 207-226
• Forecasting: GDP deviations from trend
• positive: inflationary gaps
• negative: unemployment gaps
• Literature: pairs of interest rates, other variables
• Data span: September 2009 - October 2013
• Methodologies: SVM vs Probit vs ANN
• Variables: Eurocoin and interest rates: 3 & 6
months and 1, 2, 3, 5, 7, 10, 20 years.
• We tested:
• 27 interest rates in pairs
• 27 interest rates in triplets
• All interest rates
[Figure: two yield curve segments between points A and B, one passing through C and the other through D]
• Motivation for interest rate triplets: exploit the curvature
• AB same slope
• ACB and ADB different curvature (concave vs convex)
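The point about triplets can be made concrete with a slope and a curvature measure. The sketch below uses a common "butterfly" proxy for curvature (twice the middle rate minus the two end rates); this measure and all rate values are assumptions for illustration, not the features used in the paper.

```python
def slope(short_rate, long_rate):
    """Slope between the two end points of the curve."""
    return long_rate - short_rate


def curvature(short_rate, mid_rate, long_rate):
    """Positive when the middle rate lies above the straight line
    joining the end points, negative when it lies below."""
    return 2.0 * mid_rate - short_rate - long_rate


# Hypothetical rates (in percent): A and B are shared end points,
# C and D are two alternative middle points.
a, b = 1.0, 3.0
c, d = 2.5, 1.5

print(slope(a, b))            # identical for both curves
print(curvature(a, c, b))     # curve through C
print(curvature(a, d, b))     # curve through D
```

A pair of rates only sees the shared slope, whereas a triplet can tell the two curves apart through the sign of the curvature.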
Forecasting EU recessions
Methodology  Kernel      Combination  Train accuracy (%)  Test accuracy (%)  Growth accuracy (%)  Recession accuracy (%)
Pairs
Probit       -           2Y-10Y       82.00               65.00              57.00                78.00
SVM          RBF         2Y-20Y       73.68               78.26              93.00                56.00
SVM          RBF         3Y-20Y       72.63               78.26              93.00                56.00
SVM          Polynomial  3M-5Y        78.94               69.56              50.00                100.00
SVM          Polynomial  1Y-2Y        77.89               69.56              50.00                100.00
NN           -           3M-5Y        79.59               65.21              42.85                100.00
Triplets
Probit       -           1Y-5Y-10Y    74.00               61.00              43.00                89.00
SVM          RBF         1Y-3Y-20Y    73.68               91.30              86.00                100.00
SVM          RBF         3M-2Y-20Y    73.68               78.26              64.00                100.00
SVM          Polynomial  1Y-3Y-7Y     74.73               65.22              50.00                89.00
SVM          Polynomial  3M-5Y-20Y    71.57               65.22              50.00                89.00
NN           -           6M-5Y-7Y     81.63               65.21              57.14                77.78
All interest rates
Probit       -           -            83.00               65.00              57.00                78.00
SVM          Linear      -            58.94               39.13              0.00                 100.00
SVM          RBF         -            74.73               69.56              79.00                56.00
SVM          Polynomial  -            74.73               56.52              50.00                67.00
NN           -           -            75.51               69.56              78.57                55.56
Slide 53 of 52
Augmenting with Fundamentals
• Australian dollar/euro
• Canadian dollar/euro
• Czech koruna/euro
• Chinese yuan/euro
• Dollar/euro
• CPI
• Oil price
• Unemployment
• Inflation
• M1
• M2
• M3
• Industrial Index
• PPI
• External liabilities
• External demand
Slide 54 of 52
Forecasting EU recessions
[Bar chart: Train Accuracy (%) and Test accuracy (%), axis 0 to 100, for the 1Y-3Y-20Y triplet, the Australian dollar/euro exchange rate and M1. Data labels as extracted: 73.68, 87.36, 90, 91.3, 91.3, 95.65]
Slide 55 of 52
Thank you
Acknowledgments
This research has been co-financed by the European Union
(European Social Fund – ESF) and Greek national funds
through the Operational Program "Education and Lifelong
Learning" of the National Strategic Reference Framework
(NSRF) - Research Funding Program: THALES. Investing in
knowledge society through the European Social Fund.
