3. Prediction Methods
Prediction method:
• Analytical: physics (it’s complicated)
• Numerical: CFD solver (model setup)
• Empirical: machine learning (data)
Data Science Symposium 2019
4. Prediction Methods: Machine Learning
Neural networks: Overtopping
Decision tree based: [figure: example tree splitting on F1 < 2, with leaf values 0.5 and 0.2]
5. Gradient Boosting with decision trees
• Construct tree i
• Calculate residuals
• Construct tree i+1 to improve high-error samples
• Combine trees
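The boosting loop above can be sketched in a few lines. This is a minimal illustration, not the XGBoost implementation: it assumes squared-error loss, a single input feature, and depth-1 trees (stumps). The names `fit_stump`, `fit_boosting`, `predict_boosting` and `mse`, the toy data, and the learning rate are all hypothetical. Fitting each new tree to the residuals is what makes it concentrate on the samples that currently have the highest error.

```python
# Minimal gradient boosting sketch (squared-error loss, one feature).
# Each "tree" is a depth-1 stump: an illustrative simplification.

def fit_stump(x, residuals):
    """Pick the threshold on x that splits the residuals with least squared error."""
    candidates = sorted(set(x))[:-1]  # drop the largest value so the right side is never empty
    if not candidates:
        raise ValueError("need at least two distinct x values")
    best = None
    for t in candidates:
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    return best[1:]  # (threshold, left value, right value)


def predict_stump(stump, xi):
    t, lmean, rmean = stump
    return lmean if xi <= t else rmean


def fit_boosting(x, y, n_trees=50, learning_rate=0.1):
    base = sum(y) / len(y)  # start from the mean of the targets
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # Calculate residuals of the current ensemble.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        # Construct the next tree on those residuals.
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Combine trees: add the new tree's shrunken prediction.
        pred = [pi + learning_rate * predict_stump(stump, xi)
                for pi, xi in zip(pred, x)]
    return base, stumps, learning_rate


def predict_boosting(model, xi):
    base, stumps, lr = model
    return base + lr * sum(predict_stump(s, xi) for s in stumps)


def mse(model, x, y):
    return sum((yi - predict_boosting(model, xi)) ** 2
               for xi, yi in zip(x, y)) / len(y)


# Hypothetical toy data: y = x^2 on a small grid.
x = [i / 10 for i in range(20)]
y = [xi ** 2 for xi in x]
model = fit_boosting(x, y)
print("training MSE:", mse(model, x, y))
```

With `n_trees=0` the model predicts the mean only, which gives a baseline to compare the boosted fit against.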
13. Workflow
• Split into train/test
• Create subsamples of the train set
• Train 500 models
• Test 500 models
• 500 predictions for each target
• Do some stats: mean, min, max
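The workflow above can be sketched as follows. This is a rough sketch under stated assumptions: the toy data, the 80/20 split, subsampling with replacement, and the least-squares line used as a stand-in for the real model are all hypothetical. Only the steps themselves and the count of 500 models come from the slide.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical toy data: y depends linearly on x, plus noise.
data = [(i / 10, 2.0 * (i / 10) + 1.0 + random.gauss(0, 0.5))
        for i in range(100)]

# Split into train/test (80/20 is an assumption).
random.shuffle(data)
train, test = data[:80], data[80:]


def fit_line(samples):
    """Least-squares fit y = a*x + b; a stand-in for the real model."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples)
    sxy = sum((x - mx) * (y - my) for x, y in samples)
    a = sxy / sxx
    return a, my - a * mx


# Create subsamples of the train set and train one model on each.
N_MODELS = 500
models = []
for _ in range(N_MODELS):
    # Sampling with replacement is an assumption, not stated on the slide.
    subsample = random.choices(train, k=len(train))
    models.append(fit_line(subsample))

# Test all models: 500 predictions for each target, then summary stats.
summary = []
for x, y_true in test:
    preds = [a * x + b for a, b in models]
    summary.append({
        "target": y_true,
        "mean": statistics.mean(preds),
        "min": min(preds),
        "max": max(preds),
    })

for row in summary[:3]:
    print(row)
```

The min and max across the 500 models give a simple per-target band around the mean prediction.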
15. Is the model any good? Compare!
Neural networks
Overtopping
• Overtopping 2.04
• UNIBO
21. Conclusions
• Gradient boosting outperforms neural nets
• Model spread
• RMSE
• scikit-learn gradient boosting not as good as XGBoost
• Multicollinear features should not be removed for XGBoost
• XGBoost is exchangeable
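The two metrics named in the conclusions can be computed as below. RMSE follows its standard definition. The slides do not define model spread, so treating it as max minus min across the ensemble's predictions for each target is an assumption. The function names and numbers are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def model_spread(preds_per_target):
    """Assumed definition: max minus min of the ensemble predictions per target."""
    return [max(preds) - min(preds) for preds in preds_per_target]


# Hypothetical numbers, for illustration only.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
print(rmse(y_true, y_pred))

# Each inner list holds the predictions of all models for one target.
ensemble_preds = [[1.0, 1.5, 0.5], [2.0, 2.0, 2.0]]
print(model_spread(ensemble_preds))
```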