Energy Consumption Prediction Using Data Reduction and Ensemble Learning Techniques
DOI: https://doi.org/10.5614/itbj.ict.res.appl.2022.16.3.1

Keywords: bagging, boosting, data reduction, dimensionality reduction, energy efficiency, ensemble learning, LightGBM, numerosity reduction

Abstract
Building energy problems have many aspects, one of which is the difficulty of measuring energy efficiency. With the growth of available data, energy efficiency can now be assessed by developing predictive models that estimate future building energy needs. However, the massive volume of data raises problems of data quality and of scalability in terms of computation memory and training time. In this study, we applied data reduction and ensemble learning techniques to overcome these problems. We used numerosity reduction, dimensionality reduction, and a boosting-based LightGBM model augmented with a bagging technique, which we compared with incremental learning. Our experimental results showed that the numerosity reduction and dimensionality reduction techniques sped up training and prediction without reducing accuracy. Testing the ensemble learning models also revealed that bagging performed best in terms of RMSE and speed, achieving an RMSE of 262.304 while running 1.67 times faster than the model trained with incremental learning.
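The abstract does not include code, but the pipeline it describes can be sketched in Python. The block below is a minimal illustration, not the authors' implementation: it uses stratified sampling as the numerosity reduction step, PCA as one common dimensionality reduction choice, a LightGBM regressor with row bagging as the ensemble model, and LightGBM's `init_model` mechanism for the incremental-learning baseline. The synthetic data, sampling fraction, component count, and hyperparameters are all illustrative assumptions, not values from the study.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from lightgbm import LGBMRegressor

# Stand-in data: 100k rows of hypothetical building-sensor features and an
# energy-consumption target (the study itself uses real building data).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100_000, 40)))
y = X.iloc[:, :5].sum(axis=1) + rng.normal(size=len(X))

# Numerosity reduction: keep a stratified 20% sample, stratifying on binned
# target values so the sample preserves the consumption distribution.
bins = pd.qcut(y, q=10, labels=False)
X_red, _, y_red, _ = train_test_split(
    X, y, train_size=0.2, stratify=bins, random_state=0
)

# Dimensionality reduction: project onto the principal components that
# retain ~95% of the variance.
X_pca = PCA(n_components=0.95).fit_transform(X_red)
y_red = y_red.to_numpy()

# Ensemble learning: boosting (LightGBM) with bagging enabled through row
# subsampling on every boosting iteration.
model = LGBMRegressor(
    n_estimators=500,
    learning_rate=0.05,
    subsample=0.8,     # LightGBM alias: bagging_fraction
    subsample_freq=1,  # LightGBM alias: bagging_freq
    random_state=0,
)
model.fit(X_pca, y_red)

# Incremental-learning baseline: LightGBM can continue training from an
# earlier model via `init_model`, consuming the data in chunks.
half = len(X_pca) // 2
base = LGBMRegressor(n_estimators=250, random_state=0).fit(X_pca[:half], y_red[:half])
incremental = LGBMRegressor(n_estimators=250, random_state=0)
incremental.fit(X_pca[half:], y_red[half:], init_model=base)
```

In a sketch like this, the reduction steps shrink both the row count and the feature count before any model is trained, which is what makes the reported speedups possible without touching the learning algorithm itself.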