Development of a Holistic Machine Learning-Based Approach for Building Energy Consumption Prediction under Limited Data Conditions

  • Qingyao Qiao

Student thesis: PhD


Machine learning (ML) methods have been widely applied to predicting the energy consumption of buildings. Because these methods are data-intensive, prediction performance depends to a great extent on the quality of the data. Missing input features cause underfitting, which significantly impedes prediction performance. Currently, a considerable number of buildings suffer from data availability issues because of underperforming building energy management systems. A comprehensive understanding of the implications of accurately predicting the energy consumption of buildings using ML methods with limited data is essential for building energy efficiency and energy planning. However, research in this area is still at a preliminary stage. To alleviate the difficulties caused by the lack of data, this thesis develops a comprehensive framework consisting of feature creation and feature selection: feature creation expands the dimensionality of the original limited data (e.g., meteorological data and time information), while feature selection picks out the most relevant features. Three buildings with different functions at the University of Manchester were selected as case studies to evaluate the generalisation capabilities of the proposed framework. Meteorological data (e.g., temperature, apparent temperature, relative humidity, global solar radiation, indirect solar radiation, wind speed, wind direction and cloud level) were used to predict the hourly electricity consumption of the three buildings. A variety of feature creation methods were initially implemented, including extracting time information from the meteorological data, accounting for the delayed effect of weather on energy consumption, and decomposing the weather data with empirical mode decomposition.
In addition, considering the pivotal role of occupant behaviour in energy consumption, an occupant behaviour simulation module based on agent-based modelling was developed to simulate the indoor electricity-related behaviour of students. Together, these feature creation methods significantly extended the dimensionality of the data. For feature selection, a variety of filter and wrapper methods were applied to the extended dataset. The results indicated that wrapper feature selection outperformed filter feature selection in determining the most important feature subset, and that the performance of the ML methods improved significantly when using the selected feature subset compared with using the original data.
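The feature creation and feature selection steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the thesis's implementation: the hourly data are synthetic, only time and lag features are created (no empirical mode decomposition or agent-based occupancy simulation), the filter is a plain Pearson-correlation ranking, and the wrapper is a greedy forward search around an ordinary least-squares model standing in for the ML methods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic hourly weather (illustrative assumption, not the thesis data).
hours = np.arange(24 * 60)                      # 60 days of hourly steps
theta = 2 * np.pi * (hours % 24) / 24           # hour of day as an angle
temp = 10 + 2 * np.sin(theta - 2 * np.pi * 9 / 24) + rng.normal(0, 4, hours.size)

# Feature creation: cyclic time information plus lagged weather values.
MAX_LAG = 3
features = {"temp": temp, "hour_sin": np.sin(theta), "hour_cos": np.cos(theta)}
for lag in range(1, MAX_LAG + 1):
    features[f"temp_lag{lag}"] = np.roll(temp, lag)   # value from `lag` hours earlier
names = list(features)
# Drop the first MAX_LAG rows, whose lagged values wrapped around.
X = np.column_stack([features[n] for n in names])[MAX_LAG:]

# Hypothetical load: responds to temperature two hours earlier and to time of day.
y = (50 + 2 * X[:, names.index("temp_lag2")]
     + 10 * X[:, names.index("hour_cos")]
     + rng.normal(0, 1, len(X)))

def filter_rank(X, y, names):
    """Filter method: rank features by absolute Pearson correlation with y."""
    scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return [names[j] for j in np.argsort(scores)[::-1]]

def holdout_rmse(X_tr, y_tr, X_va, y_va):
    """Fit least squares with an intercept and return the validation RMSE."""
    A_tr = np.column_stack([np.ones(len(X_tr)), X_tr])
    A_va = np.column_stack([np.ones(len(X_va)), X_va])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return float(np.sqrt(np.mean((A_va @ coef - y_va) ** 2)))

def wrapper_forward(X, y, names, k):
    """Wrapper method: greedily add the feature that most lowers hold-out RMSE."""
    split = int(0.7 * len(y))                   # chronological train/validation split
    chosen = []
    while len(chosen) < k:
        candidates = [j for j in range(X.shape[1]) if j not in chosen]
        best = min(candidates, key=lambda j: holdout_rmse(
            X[:split][:, chosen + [j]], y[:split],
            X[split:][:, chosen + [j]], y[split:]))
        chosen.append(best)
    return [names[j] for j in chosen]

ranking = filter_rank(X, y, names)
selected = wrapper_forward(X, y, names, k=2)
print("filter ranking:", ranking)
print("wrapper subset:", selected)
```

The filter scores each feature in isolation, whereas the wrapper evaluates candidate subsets through the model's validation error, which is the distinction the abstract draws between the two families. In practice the least-squares stand-in would be replaced by whichever ML model is being evaluated.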
Date of Award: 1 Aug 2023
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Rodger Edwards (Supervisor) & Akilu Yunusa-Kaltungo (Supervisor)
