Application of Machine Learning to Star Formation

  • Joseph Mwatukange

Student thesis: Master of Science by Research


This dissertation applies machine learning to a dataset of spectral lines of the complex organic molecule methyl cyanide (CH3CN), observed with the Atacama Large Millimetre/sub-millimetre Array (ALMA) telescope. Big data analysis presents new challenges for astronomy in finding or forecasting interesting events or patterns. In this work, we design and implement a spectral data compression technique using the discrete wavelet transform (DWT) and present ensemble machine learning (ML) models as a method for obtaining the physical parameters of CH3CN line emission, such as excitation temperature, column density, source size, FWHM, and velocity gradients. A random forest regressor and an extreme gradient boosting (xgboost) regressor were both used as ML techniques. The ML models were trained using synthetic data, and they performed well, with the tuned xgboost having the highest accuracy of all models. When the ML models were applied to the observational data, however, our analysis revealed that they performed poorly at reconstructing spectra that matched the observations. Finally, we investigate possible reasons for this poor performance.
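As a rough illustration of the DWT compression step mentioned in the abstract, the following minimal sketch reduces a synthetic spectrum to a shorter feature vector by keeping only the approximation coefficients of a multi-level Haar transform. The Haar wavelet, the three decomposition levels, the 256-channel Gaussian line, and all function names are assumptions chosen for illustration; the abstract does not state which wavelet, depth, or line model the thesis uses.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT; len(signal) must be even."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar level from approximation and detail coefficients."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out


def compress(signal, levels):
    """Keep only the approximation coefficients after `levels` Haar steps."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_dwt(approx)  # detail coefficients are discarded
    return approx


def reconstruct(approx, levels):
    """Invert `compress`, treating the discarded detail coefficients as zero."""
    signal = list(approx)
    for _ in range(levels):
        signal = haar_idwt(signal, [0.0] * len(signal))
    return signal


def gaussian_line(n_channels, centre, fwhm, peak):
    """Hypothetical single Gaussian emission line sampled on integer channels."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [peak * math.exp(-0.5 * ((i - centre) / sigma) ** 2)
            for i in range(n_channels)]


# Synthetic 256-channel spectrum compressed by a factor of 2**3 = 8.
spectrum = gaussian_line(n_channels=256, centre=128.0, fwhm=24.0, peak=1.0)
features = compress(spectrum, levels=3)
approximation = reconstruct(features, levels=3)
max_error = max(abs(a - b) for a, b in zip(spectrum, approximation))
print(len(spectrum), len(features), max_error)
```

In a pipeline of the kind the abstract describes, a vector such as `features` would be the input to a regressor (for example a random forest or xgboost model) trained to predict the physical parameters of the line; that stage is omitted here.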
Date of Award: 1 Aug 2023
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Christopher Conselice (Supervisor) & Gary Fuller (Supervisor)
