Realised Volatility Forecasting: Machine Learning via Financial Word Embedding

Research output: Working paper


We develop FinText, a novel, state-of-the-art financial word embedding built from the Dow Jones Newswires Text News Feed Database. Incorporating this word embedding in a machine learning model produces a substantial increase in volatility forecasting performance on days with volatility jumps for 23 NASDAQ stocks from 27 July 2007 to 18 November 2016. A simple ensemble model, combining the model based on our word embedding with another machine learning model that uses limit order book data, provides the best forecasting performance for both normal and jump volatility days. Finally, we use Integrated Gradients and SHAP (SHapley Additive exPlanations) to make the results more 'explainable' and the model comparisons more transparent.
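As a rough illustration of the quantities the abstract refers to, the sketch below computes a daily realised variance as the sum of squared intraday log returns (the standard definition) and combines two forecasts into a simple ensemble. The abstract does not say how the ensemble is weighted, so the equal-weight average here is an assumption, and the function names `realised_variance` and `ensemble_forecast` are hypothetical, not taken from the paper.

```python
import numpy as np


def realised_variance(intraday_prices):
    """Sum of squared intraday log returns for one trading day."""
    prices = np.asarray(intraday_prices, dtype=float)
    log_returns = np.diff(np.log(prices))
    return float(np.sum(log_returns ** 2))


def ensemble_forecast(news_forecast, lob_forecast, weight=0.5):
    """Convex combination of two forecast series.

    Assumption: equal weights by default; the paper's actual
    combination rule is not given in the abstract.
    """
    news = np.asarray(news_forecast, dtype=float)
    lob = np.asarray(lob_forecast, dtype=float)
    return weight * news + (1.0 - weight) * lob
```

Here `news_forecast` would stand for the output of the word-embedding model and `lob_forecast` for the output of the limit order book model; both are placeholders for whatever forecasts those models produce.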
Original language: English
Publication status: Published - 29 Jul 2021


Keywords

  • Realised Volatility Forecasting; Machine Learning; Natural Language Processing; Word Embedding; Explainable AI; Dow Jones Newswires; Big Data

Research Beacons, Institutes and Platforms

  • Institute for Data Science and AI
  • Digital Futures

