Abstract
This paper examines, for the first time, the performance of machine learning (ML) models in realised volatility forecasting using big data sets such as LOBSTER limit order books and news stories from 'Dow Jones News Wires' for 28 NASDAQ stocks over a sample period of June 28, 2007, to November 17, 2016. We find strong evidence that ML models outperform an extended CHAR model and all other HAR-family models on evaluation measures such as MSE, QLIKE, MDA and RC values. The LOB-ML model has very strong forecasting power, and adding news sentiment variables to the data set improves it only marginally. However, the good forecasting performance of ML models holds only on normal-volatility days (i.e. 90% of the out-of-sample period). Throughout the study, we find a persistent trade-off between normal-day and jump-day forecasting: a model that serves well on normal days performs poorly on jump days, and vice versa.
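As an illustration of three of the evaluation measures named in the abstract, the following is a minimal sketch in plain Python. The abstract does not give the paper's exact definitions, so the forms used here are assumptions: MSE as the mean squared forecast error, QLIKE in the common form RV/F - log(RV/F) - 1, and MDA as the share of days on which the forecast predicts the correct direction of change relative to the previous day's realised value. The function names and the sample numbers are hypothetical.

```python
import math

def mse(actual, forecast):
    """Mean squared error between realised and forecast volatility."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def qlike(actual, forecast):
    """QLIKE loss in the assumed form a/f - log(a/f) - 1.

    Requires strictly positive inputs; equals zero for a perfect forecast.
    """
    return sum(a / f - math.log(a / f) - 1.0
               for a, f in zip(actual, forecast)) / len(actual)

def mda(actual, forecast):
    """Mean directional accuracy (assumed definition).

    Counts a hit when the forecast and the realised value move in the
    same direction relative to the previous day's realised value.
    """
    hits = 0
    for t in range(1, len(actual)):
        actual_up = actual[t] > actual[t - 1]
        forecast_up = forecast[t] > actual[t - 1]
        hits += actual_up == forecast_up
    return hits / (len(actual) - 1)

# Hypothetical realised volatilities and forecasts, for illustration only.
rv = [1.0, 2.0, 1.5, 3.0]
fc = [1.0, 1.5, 1.8, 2.5]
print(mse(rv, fc), qlike(rv, fc), mda(rv, fc))
```

Lower MSE and QLIKE values indicate better forecasts, while a higher MDA indicates better directional accuracy. QLIKE penalises under-prediction of volatility more heavily than over-prediction, which is one reason it is often reported alongside MSE.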
Original language | English |
---|---|
Publisher | Social Science Research Network |
Number of pages | 51 |
Publication status | Published - 12 Oct 2020 |
Keywords
- Realised Volatility Forecasting
- Machine Learning
- Long Short-Term Memory
- Heterogeneous AutoRegressive (HAR) Models
- Limit Order Book (LOB) Data
- Dow Jones Corporate News
- Big Data
Research Beacons, Institutes and Platforms
- Institute for Data Science and AI