Text mining tweets on e-cigarette risks and benefits using machine learning following a vaping related lung injury outbreak in the USA

Lamiece Hassan, Mohab Elkaref, Geeth de Mel, Ilze Bogdanovica, Goran Nenadic

Research output: Contribution to journal › Article › peer-review


Electronic nicotine delivery systems (ENDS) (also known as ‘e-cigarettes’) can support smoking cessation, although the long-term health impacts are not yet known. In 2019, a cluster of lung injury cases in the USA emerged that were ostensibly associated with ENDS use. Subsequent investigations revealed a link with vitamin E acetate, an additive used in some ENDS liquid products containing tetrahydrocannabinol (THC). This became known as the EVALI (E-cigarette or Vaping product use Associated Lung Injury) outbreak. While few cases were reported in the UK, the EVALI outbreak intensified attention on ENDS in general worldwide. We aimed to describe and explore public commentary and discussion on Twitter immediately before, during and following the peak of the EVALI outbreak using text mining techniques. Specifically, topic modelling, operationalised using Latent Dirichlet Allocation (LDA) models, was used to discern discussion topics in 189,658 tweets about ENDS (collected April–December 2019). Individual tweets and Twitter users were assigned to their dominant topics and countries respectively to enable international comparisons. A 10-topic LDA model fit the data best. We organised the ten topics into three broad themes for the purposes of reporting: informal vaping discussion; vaping policy discussion and EVALI news; and vaping commerce. Following EVALI, there were signs that informal vaping discussion topics decreased while discussion topics about vaping policy and the relative health risks and benefits of ENDS increased, not limited to THC products. Though subsequently attributed to THC products, the EVALI outbreak disrupted online public discourses about ENDS generally, amplifying health and policy commentary. There was a relatively stronger presence of commercially oriented tweets among UK Twitter users compared to USA users.
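The step of assigning each tweet to its dominant topic, and then grouping topics into broader themes, can be illustrated with a minimal sketch. It assumes a fitted topic model has already produced a row of topic weights per tweet. The weights, the four-topic size, the `TOPIC_TO_THEME` mapping and the function names are all hypothetical and chosen for illustration; the abstract does not state which of the ten topics belongs to which theme, and this is not the authors' code.

```python
from collections import Counter

# Hypothetical mapping from topic index to reporting theme. The abstract names
# the three themes but not which topics fall under each one.
TOPIC_TO_THEME = {
    0: "informal vaping discussion",
    1: "informal vaping discussion",
    2: "vaping policy discussion and EVALI news",
    3: "vaping commerce",
}


def dominant_topic(weights):
    """Return the index of the largest topic weight for one tweet.

    Ties resolve to the lowest topic index.
    """
    return max(range(len(weights)), key=lambda k: weights[k])


def theme_shares(doc_topic_rows):
    """Return the share of tweets per theme, using each tweet's dominant topic."""
    counts = Counter(TOPIC_TO_THEME[dominant_topic(row)] for row in doc_topic_rows)
    total = sum(counts.values())
    return {theme: n / total for theme, n in counts.items()}


# Made-up document-topic weights for two periods (4 topics here, not 10).
before = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
    [0.6, 0.2, 0.1, 0.1],
]
after = [
    [0.1, 0.1, 0.7, 0.1],
    [0.2, 0.1, 0.6, 0.1],
    [0.5, 0.2, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
]

shares_before = theme_shares(before)
shares_after = theme_shares(after)
```

Comparing `shares_before` with `shares_after` gives the kind of before-and-after contrast the abstract describes, here on toy numbers only. The same counting could be repeated per country once users have been assigned a location.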

Original language: English
Article number: 100066
Journal: Healthcare Analytics
Early online date: 7 Jun 2022
Publication status: Published - Nov 2022


  • ENDS
  • Machine learning
  • Public health
  • Social media
  • Twitter
  • UK
  • USA
  • e-cigarettes

