Performance Evaluation of Distributed Machine Learning for Load Forecasting in Smart Grids

EasyChair Preprint no. 2411

6 pages. Date: January 18, 2020


Load forecasting in smart grids is the process of predicting the electrical power needed to meet short-term, medium-term and long-term demand. Highly accurate load forecasting helps electrical utilities manage production, operations and grid control so that they can satisfy that demand. Most state-of-the-art methodologies use classical machine learning algorithms to predict the electrical load, and the available solutions do not yet use big data platforms and parallel distributed computing to their full potential. In this paper, Apache Spark and Apache Hadoop are used as big data platforms for distributed computing, since they are more efficient for parallel computation. MLlib, the Apache Spark library of machine learning algorithms, is used for the distributed computations. Using MLlib, classic regression algorithms (linear regression, generalized linear regression, decision tree, random forest and gradient-boosted trees) are tested, in addition to survival regression and isotonic regression. The results show that Apache Spark achieves high accuracy while parallelizing the load forecasting process, with competitive training and test times.
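
As a rough illustration of what parallelizing a regression fit means, the sketch below fits a one-feature linear regression by computing partial sums per data partition and then combining them, which is the general map-reduce pattern that distributed platforms rely on. It is not the authors' method and does not use Spark or MLlib: it is plain Python with a thread pool standing in for cluster workers, and the data, the function names (`partial_stats`, `combine`, `fit_linear`, `rmse`) and the choice of RMSE as the accuracy measure are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import time


def partial_stats(partition):
    """Map step: sufficient statistics for one partition of (x, y) pairs."""
    n = len(partition)
    sx = sum(x for x, _ in partition)
    sy = sum(y for _, y in partition)
    sxx = sum(x * x for x, _ in partition)
    sxy = sum(x * y for x, y in partition)
    return (n, sx, sy, sxx, sxy)


def combine(a, b):
    """Reduce step: element-wise sum of two statistics tuples."""
    return tuple(u + v for u, v in zip(a, b))


def fit_linear(partitions):
    """Fit y = slope * x + intercept from per-partition statistics."""
    # The thread pool is only a stand-in for distributed workers.
    with ThreadPoolExecutor() as pool:
        stats = list(pool.map(partial_stats, partitions))
    n, sx, sy, sxx, sxy = reduce(combine, stats)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def rmse(model, rows):
    """Root mean squared error of the fitted line on (x, y) rows."""
    slope, intercept = model
    squared = sum((slope * x + intercept - y) ** 2 for x, y in rows)
    return (squared / len(rows)) ** 0.5


# Synthetic, hypothetical data (not from the paper): hour index vs. load,
# generated from an exact linear rule so the fit can be checked by eye.
rows = [(float(h), 2.0 * h + 50.0) for h in range(24)]
partitions = [rows[i::4] for i in range(4)]  # four interleaved partitions

start = time.perf_counter()
model = fit_linear(partitions)
train_seconds = time.perf_counter() - start  # wall-clock training time
error = rmse(model, rows)
```

Because the statistics are simple sums, the result does not depend on how the rows are split across partitions, which is what makes this kind of fit easy to distribute. The tree-based and isotonic models named in the abstract need more involved schemes than this.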

Keyphrases: Apache Spark, distributed computing, Distributed Machine Learning, load forecast, Smart Grids

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
  author = {Dabeeruddin Syed and Shady S. Refaat and Haitham Abu-Rub},
  title = {Performance Evaluation of Distributed Machine Learning for Load Forecasting in Smart Grids},
  howpublished = {EasyChair Preprint no. 2411},

  year = {EasyChair, 2020}}