# Load Forecasting for Months of the Lunar New Year Holiday Using Standardized Load Profile and Support Vector Regression

Load forecasting plays an important role in building business strategies and in ensuring the reliable and safe operation of any electrical system.

Many methods are used for short-term forecasting, including regression models, time series, neural networks, expert systems, fuzzy logic, machine learning and statistical algorithms. In practice, however, the key requirement is to minimize forecast errors, so as to prevent power shortages or waste in the electricity market and to limit risk.

For Asian countries that use the lunar calendar (such as Vietnam), one of the most difficult periods to forecast is the Lunar New Year, which usually falls in late January or early February. Because the solar and lunar calendars deviate from each other, the holiday falls on different solar dates each year, so the load patterns of the same solar dates are not identical from year to year.

As a result, forecasting algorithms often produce large errors for this period. This paper proposes a short-term load forecasting method that constructs a Standardized Load Profile (SLP) from past electrical load data and combines it with the Support Vector Regression (SVR) machine learning algorithm to improve forecasting accuracy.

**1. Introduction**

Load forecasting is a long-studied topic in electrical systems. There are two main approaches: traditional statistical modeling of the relationship between load and the factors that affect it (time series, regression analysis, etc.), and artificial intelligence and machine learning methods. Statistical methods assume that load data follow a pattern and try to forecast future load values using different time series analysis techniques. Intelligent systems are derived from mathematical expressions of human behavior and experience.

Since the early 1990s in particular, neural networks have been among the most commonly used techniques in electrical load forecasting, because they assume that a nonlinear function relates historical values and some external variables to the future values of the output (Shyamali and Chawalit, 2016). The approximation ability of neural networks has made their application popular.

In recent years, computational intelligence methods based on Support Vector Machines have been widely used in load forecasting. In 2001, Bo-Juen Chen, Ming-Wei Chang, and Chih-Jen Lin used the Support Vector Regression technique to solve an electrical load prediction problem (forecasting the maximum daily load for the next 31 days). This was a competition organized by EUNITE (European Network on Intelligent Technologies for Smart Adaptive Systems).

The information provided included demand data for the past two years, daily temperatures for the past four years and local holiday events. The data was divided into two parts: one used for training (about 80–90%) and the rest used for testing the algorithm (about 10–20%). The training inputs included data from the previous day, the previous hour and the previous week, and the average of the previous week. Their approach won the competition (Juan et al., 2016).
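The lagged input set described above can be sketched as follows. This is a minimal illustration rather than the competition entry's actual code: the function names, the synthetic load series and the 0.85 training fraction (an arbitrary value inside the 80–90% range) are all assumptions.

```python
from statistics import mean

HOURS_PER_DAY = 24
HOURS_PER_WEEK = 7 * HOURS_PER_DAY


def lag_features(load, t):
    """Return the four lagged inputs for hour index t, or None if history is too short."""
    if t < HOURS_PER_WEEK:
        return None  # the weekly lags need a full week of history
    return {
        "prev_hour": load[t - 1],
        "prev_day": load[t - HOURS_PER_DAY],
        "prev_week": load[t - HOURS_PER_WEEK],
        "prev_week_mean": mean(load[t - HOURS_PER_WEEK:t]),
    }


def chronological_split(rows, train_fraction=0.85):
    """Split time-ordered rows without shuffling; the tail is kept for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Synthetic hourly load (hypothetical values): two weeks of a repeating daily ramp.
load = [100 + 10 * (h % HOURS_PER_DAY) for h in range(2 * HOURS_PER_WEEK)]
rows = [f for t in range(len(load)) if (f := lag_features(load, t)) is not None]
train, test = chronological_split(rows)
print(len(rows), len(train), len(test))
```

Every feature here depends on load values observed up to one week earlier, which is why such inputs run out once the forecast horizon extends past the available history.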

Since then, several studies have explored different techniques for optimizing SVR for load forecasting (Lemuel Clark et al., 2015; Electricity Load Forecasting; Nguyen et al., 2018; Ceperic et al., 2013; Vapnik, 1995; Gunn, 1998; Cherkassky and Ma, 2002; Basak et al., 2007). The main reason for using SVM in load forecasting is that it can easily model the load curve and the relationship between the load and the factors that drive changes in load demand (such as temperature, economic and demographic factors).

However, some problems arise when these algorithms are applied in practice:

– Climate conditions always play an important role in load forecasting because load demand depends on them. When forecasting beyond the test period, however, it is very difficult to forecast the weather and climate values used as inputs to the algorithm, and these values are often not available.

– Electrical load samples include hidden elements and tend to resemble the preceding load pattern. This leads to a wrong forecast for the following days if their day type differs from that of the previous day or if a special event occurs. Using a dataset of lagged values (data from the previous day, the previous hour, the previous week and the average of the previous week) as training inputs is therefore risky when the load patterns are not identical.

– If the forecast horizon is longer than the span of past data used as input (more than 7 days, since the algorithm relies on the previous week's values), inputs needed to run the algorithm will be missing.

– In addition, for Asian countries that use the lunar calendar (such as Vietnam), one of the most difficult periods to forecast is the Lunar New Year (usually in late January or early February), along with other lunar-calendar holidays (such as the Hung Kings' Anniversary). Because the solar and lunar calendars deviate from each other, the load patterns are not identical from year to year, and forecasting algorithms often produce large errors for this period.
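The calendar deviation in the last point can be made concrete with a small sketch. It is not part of the paper's method: it only shows how far the same solar date sits from the Lunar New Year in different years. The function names are illustrative, and the dates are the first day of the Lunar New Year in each year of the study period.

```python
from datetime import date

# First day of the Lunar New Year on the solar calendar, 2015-2018.
TET = {
    2015: date(2015, 2, 19),
    2016: date(2016, 2, 8),
    2017: date(2017, 1, 28),
    2018: date(2018, 2, 16),
}


def days_from_tet(d):
    """Signed offset in days between d and the Lunar New Year of the same solar year."""
    return (d - TET[d.year]).days


def same_lunar_position(d, target_year):
    """Solar date in target_year at the same offset from the Lunar New Year as d."""
    return TET[target_year] + (d - TET[d.year])


# 10 February falls before the holiday in 2018 but well after it in 2017.
print(days_from_tet(date(2018, 2, 10)))               # -6
print(days_from_tet(date(2017, 2, 10)))               # 13
print(same_lunar_position(date(2018, 2, 10), 2017))   # 2017-01-22
```

A model that matches 10 February 2018 with 10 February 2017 therefore compares a pre-holiday day with a post-holiday day, which is one way to read the statement that the load models are not identical.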

For this reason, the paper proposes building a Standardized Load Profile (SLP) from the historical load dataset and using it as a training dataset. This input dataset is combined with the Support Vector Regression (SVR) algorithm to improve the accuracy of short-term forecasts, to solve the problem of deviation between the solar and lunar calendars, and to overcome the limitation of the input data frame.

SLP will be built for all 365 days of the year, that is, 8,760 hourly cycles (365 × 24). It will be an important dataset during training, testing and forecasting. SLP will standardize load models by hour, by day, by season and by special day type (including lunar dates). SLP will therefore help to resolve the above-mentioned difficulties and improve the quality of electrical load forecasting.

**2. Methodology**

Observing the February load profiles of Ho Chi Minh City over the years, a large fluctuation in the shape of the chart can be seen from year to year. This makes it extremely complicated to use historical data to forecast this period.

In practice, the algorithms used for forecasting in Vietnam have to go through an intermediate step that converts these months into regular months (without holidays or the Lunar New Year). After the calculation, the forecast result is either converted back or accepted with a large error. Commercial software from foreign vendors all has this problem.

**2.1 Standardized Load Profiles (SLP)**

The Standardized Load Profile is an electrical load profile expressed in relative values, converted from the total power consumption during the electrical load research cycle. The standardized load profile of a day / month / year for each electrical load sample is constructed by dividing the load profile of the sample (from the measured data collected by day / month / year) by the power consumption of the sample for that day / month / year.
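A minimal sketch of this normalization for a single day is given below. The first function divides by the day's total consumption, as described above; the second divides by the day's peak, which is another common way to obtain relative values. The function names and the sample readings are hypothetical, and interpreting the maximum as the daily maximum is an assumption.

```python
def energy_share_profile(hourly_load):
    """Divide each hourly reading by the day's total, so the 24 values sum to 1."""
    total = sum(hourly_load)
    if total <= 0:
        raise ValueError("total consumption must be positive")
    return [value / total for value in hourly_load]


def peak_normalized_profile(hourly_load):
    """Divide each hourly reading by the day's maximum, so the peak hour equals 1."""
    peak = max(hourly_load)
    if peak <= 0:
        raise ValueError("peak load must be positive")
    return [value / peak for value in hourly_load]


# Hypothetical hourly load for one day, in MW (illustrative values only).
day_load = [1800, 1700, 1650, 1600, 1650, 1800, 2100, 2500,
            2900, 3100, 3200, 3150, 3000, 3100, 3200, 3150,
            3000, 2900, 3000, 3100, 2900, 2600, 2200, 1900]

share = energy_share_profile(day_load)
peak = peak_normalized_profile(day_load)
print(round(sum(share), 6), max(peak))
```

Either way the profile keeps the shape of the day while removing its absolute level, which is what allows days from different years to be compared.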

Considering the load profiles of the days of the week and of some special holidays in the Ho Chi Minh City area, the differences among weekdays from Tuesday to Friday can be ignored, and these days share the same load chart. The Monday load profile differs from that of normal days between 0:00 and 9:00 because of the demand carried over from Sunday.

The Saturday load profile changes only slightly compared with normal days: the load demand mainly decreases in the evening as the weekend begins. The Sunday load profile, in contrast, is completely different from that of normal days (the demand for electricity is low).

On the New Year and Lunar New Year load charts, a noticeable difference can be seen: the curves are almost flat and the load demand is quite low because these are holidays. The load demand is lowest during the Lunar New Year, since this is the longest holiday of the year (possibly 6 to 9 days).

Standardized Load Profiles (SLP) are built by dividing the capacity recorded in each 60-minute period by the maximum capacity. An SLP needs to be built for each of the 365 days of the year. Some typical SLP:

Based on the SLP of each cycle in the past dataset, we can build the SLP dataset for future forecast periods. It should be accurate for each cycle, each type of day (holidays, weekdays, working days, etc.), each week and each month. The SLP is therefore a distinctive feature and an important input parameter in the training of the machine learning algorithms (SVR, NN) that rebuild the load curves, from which we can also estimate data that was lost or not recorded during measurement.
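One possible way to carry historical SLPs forward to a future period is sketched below: average the past profiles within each day type, then assign each future date the mean profile of its type. The paper does not spell out its construction rule, so the averaging step, the five day types (taken from the weekday discussion above), the function names and the toy profiles are all assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta


def day_type(d, holidays):
    """Classify a date into the day types distinguished in the text."""
    if d in holidays:
        return "holiday"
    wd = d.weekday()  # Monday is 0
    if wd == 0:
        return "monday"
    if wd <= 4:
        return "weekday"  # Tuesday to Friday
    return "saturday" if wd == 5 else "sunday"


def mean_profiles(history, holidays):
    """Average the historical daily SLPs (date -> 24 values) within each day type."""
    groups = defaultdict(list)
    for d, slp in history.items():
        groups[day_type(d, holidays)].append(slp)
    return {
        kind: [sum(col) / len(col) for col in zip(*profiles)]
        for kind, profiles in groups.items()
    }


def future_slp(start, days, profiles, holidays):
    """Assign each future date the mean SLP of its day type."""
    dates = [start + timedelta(days=i) for i in range(days)]
    return {d: profiles[day_type(d, holidays)] for d in dates}


def flat(level):
    """Toy 24-cycle profile with a constant level (illustrative only)."""
    return [level] * 24


# Lunar New Year's Day in 2017 and 2018.
holidays = {date(2017, 1, 28), date(2018, 2, 16)}
history = {date(2018, 2, 5): flat(0.8)}                              # Monday
history.update({date(2018, 2, d): flat(0.9) for d in (6, 7, 8, 9)})  # Tuesday to Friday
history[date(2018, 2, 10)] = flat(0.7)                               # Saturday
history[date(2018, 2, 11)] = flat(0.5)                               # Sunday
history[date(2017, 1, 28)] = flat(0.3)                               # holiday

profiles = mean_profiles(history, holidays)
week = future_slp(date(2018, 2, 12), 7, profiles, holidays)
print(week[date(2018, 2, 16)][0])  # the holiday profile is used, not the Friday one
```

Because the holiday is looked up by date rather than by weekday, the Friday that falls on the Lunar New Year receives the holiday profile, which is the behavior the lunar-date handling is meant to provide.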

**2.2 Support vector regression (SVR)**

The SVM was proposed by Vapnik to solve the data classification problem. Two years later, a version of SVM was proposed and successfully applied to nonlinear regression problems. This method is called support vector regression (SVR), and it is the most common form of SVM.

The goal of SVR is to create a model that predicts unknown outputs from known inputs. During training, the model is fitted to the known training dataset $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, where $x_i$ are the input vectors and $y_i$ the corresponding outputs. During testing, the trained model is applied to new inputs $x_1, x_2, \dots, x_n$ to predict the unknown outputs $y_1, y_2, \dots, y_n$.

Consider a known training set $\{x_i, y_i\}$, $i = 1, \dots, N$, with input vectors $x_i \in \mathbb{R}^{n}$ and scalar outputs $y_i \in \mathbb{R}$. A regression model can be developed by using a nonlinear mapping function $\varphi(\cdot): \mathbb{R}^{n} \to \mathbb{R}^{n_h}$ to map the input space into a high-dimensional feature space and building a linear regression in it, as shown in (1):

$$y = \omega^{T}\varphi(x) + b \tag{1}$$

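where ω is the weight vector and b is the bias. The optimization problem is formulated in the primal space in (2):

$$\min_{\omega,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2}\,\omega^{T}\omega + C\sum_{i=1}^{N}\left(\xi_i + \xi_i^{*}\right) \tag{2}$$

subject to the constraints shown in (3):

$$\begin{aligned}
y_i - \omega^{T}\varphi(x_i) - b &\le \varepsilon + \xi_i, \\
\omega^{T}\varphi(x_i) + b - y_i &\le \varepsilon + \xi_i^{*}, \\
\xi_i,\ \xi_i^{*} &\ge 0, \quad i = 1, \dots, N
\end{aligned} \tag{3}$$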

where $x_i$ is mapped into the high-dimensional feature space by the mapping φ, and $\xi_i$ and $\xi_i^{*}$ are slack variables for the training error above and below the tube. C is the constant that determines the cost of errors, that is, the trade-off between the complexity of the model and the degree to which larger errors are tolerated. The parameter ε is the width of the insensitive zone (the tube) used to fit the training data.
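The roles of C and ε can be illustrated numerically. In the standard ε-insensitive formulation, the smallest feasible slack for a sample equals max(0, |y − f(x)| − ε), so the objective can be evaluated directly from the residuals. The sketch below assumes that formulation; the function names and the numbers are illustrative and do not come from the paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the tube |y - f(x)| <= epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def primal_objective(weights, pairs, C, epsilon):
    """0.5 * ||w||^2 + C * (sum of slacks), each slack at its smallest feasible value."""
    regulariser = 0.5 * sum(w * w for w in weights)
    slack = sum(epsilon_insensitive_loss(y, f, epsilon) for y, f in pairs)
    return regulariser + C * slack


# Hypothetical (actual, predicted) pairs: one inside the tube, two outside it.
pairs = [(10.0, 10.05), (10.0, 10.5), (10.0, 9.0)]
losses = [epsilon_insensitive_loss(y, f, epsilon=0.1) for y, f in pairs]
objective = primal_objective([3.0, 4.0], pairs, C=10.0, epsilon=0.1)
print(losses, objective)
```

Raising C makes points outside the tube more expensive relative to the size of the weights, while widening ε lets more points fall inside the tube at no cost.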

The parameters C and ε are not known in advance and must be determined by a search procedure applied to the training set (e.g. grid search with cross-validation). The goal of SVR is to place as many input vectors $x_i$ as possible inside the tube shown in Figure 4. If $x_i$ is not inside the tube, an error is incurred.
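Grid search with cross-validation can be sketched generically as below. The code only enumerates parameter combinations and averages a validation error over folds; training the model is delegated to a user-supplied `fit_and_score` callable. The scoring function shown is a stand-in with a known minimum, not an SVR fit, and the candidate values and function names are hypothetical.

```python
import math
from itertools import product


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) for k contiguous folds over n samples."""
    fold = n // k
    for i in range(k):
        start = i * fold
        stop = n if i == k - 1 else start + fold
        val = list(range(start, stop))
        train = [j for j in range(n) if j < start or j >= stop]
        yield train, val


def grid_search(param_grid, n_samples, k, fit_and_score):
    """Return (best_params, best_mean_error) over every combination in param_grid.

    fit_and_score(params, train_idx, val_idx) must train a model on train_idx
    and return its validation error on val_idx (lower is better).
    """
    names = list(param_grid)
    best = None
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        errors = [fit_and_score(params, tr, va)
                  for tr, va in k_fold_indices(n_samples, k)]
        mean_error = sum(errors) / len(errors)
        if best is None or mean_error < best[1]:
            best = (params, mean_error)
    return best


def stand_in_score(params, train_idx, val_idx):
    """Hypothetical error surface with its minimum at C=10, epsilon=0.1, gamma=0.01."""
    return (abs(math.log10(params["C"]) - 1)
            + abs(params["epsilon"] - 0.1)
            + abs(math.log10(params["gamma"]) + 2))


grid = {"C": [1, 10, 100], "epsilon": [0.01, 0.1, 0.5], "gamma": [0.001, 0.01, 0.1]}
best_params, best_error = grid_search(grid, n_samples=100, k=5,
                                      fit_and_score=stand_in_score)
print(best_params)
```

In a real run, `fit_and_score` would fit an SVR on the training indices and return, for example, its MAPE on the validation indices. For load data, folds that respect time order may be preferable to arbitrary ones.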

To solve the optimization problem defined by (2) and (3), it is necessary to construct a dual problem using the Lagrange function, from which the weight vector ω and the bias b are obtained. The resulting SVR model in dual form is shown in (4), where $\alpha_i$ and $\alpha_i^{*}$ are the Lagrange multipliers and $K(x_i, x)$ is the kernel function, defined as the dot product $K(x_i, x) = \varphi(x_i)^{T}\varphi(x)$.

$$y = \sum_{i=1}^{N}\left(\alpha_i - \alpha_i^{*}\right)K(x_i, x) + b \tag{4}$$

Kernel functions allow the dot product in a high-dimensional feature space to be calculated from the input data in the original space, without explicitly computing φ(x). The kernel function often used in nonlinear regression problems, and the one used in this study, is the radial basis function (RBF) presented in (5):

$$K(x_i, x) = \exp\left(-\gamma\,\lVert x - x_i\rVert^{2}\right) \tag{5}$$

where γ is the kernel parameter, which must also be determined by a search procedure. More information about SVR can be found in the references.
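To show how the RBF kernel and the dual form fit together, here is a minimal prediction sketch. It assumes the standard forms, K(x_i, x) = exp(−γ‖x − x_i‖²) and a prediction equal to the kernel-weighted sum of the dual coefficients (α_i − α_i*) plus b. The support vectors, coefficients and bias are made-up numbers, not values from a trained model.

```python
import math


def rbf_kernel(x_i, x, gamma):
    """K(x_i, x) = exp(-gamma * ||x_i - x||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, b, gamma):
    """Dual-form prediction: sum_i (alpha_i - alpha_i*) * K(x_i, x) + b."""
    return b + sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )


# Hypothetical model: two support vectors with dual coefficients (alpha_i - alpha_i*).
support_vectors = [(0.0, 0.0), (1.0, 0.0)]
dual_coefs = [0.5, -0.25]
prediction = svr_predict((0.0, 0.0), support_vectors, dual_coefs, b=1.0, gamma=1.0)
print(prediction)
```

Only the kernel values between the query point and the support vectors are needed, so the mapping φ never has to be computed explicitly.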

**2.3 Research models**

Processed historical data (power consumption, capacity and temperature recorded in 24 cycles of 60 minutes each), together with the Standardized Load Profiles (SLP), are fed into modules that build regression functions using the SVR (Support Vector Regression) and NN (Neural Network) algorithms.

We then use this dataset to check and evaluate the error of the regression functions, and choose the regression function with the smallest error for the forecast phase. The SLP dataset in 24 cycles for the forecast period (including holidays, etc.) and the forecast temperature in 24 cycles for the same period are the inputs to the selected regression function, which outputs forecast results in 24 cycles for a period of 7–30 days.
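The forecast step can be sketched as assembling one input row per hourly cycle from the future SLP and the forecast temperature, then applying the selected regression function to each row. The row layout, the function names and the stand-in model below are assumptions made for illustration; the stand-in is a simple linear rule, not the paper's fitted SVR.

```python
def build_forecast_inputs(slp_by_day, temp_by_day):
    """Pair each hour's SLP value with its forecast temperature, day by day."""
    rows = []
    for day, (slp, temps) in enumerate(zip(slp_by_day, temp_by_day)):
        if len(slp) != 24 or len(temps) != 24:
            raise ValueError(f"day {day}: expected 24 cycles")
        rows.extend((day, hour, slp[hour], temps[hour]) for hour in range(24))
    return rows


def forecast(model, rows):
    """Apply a fitted regression function to every (slp, temperature) pair."""
    return [model(slp, temp) for _, _, slp, temp in rows]


def stand_in_model(slp, temperature):
    """Hypothetical regression function, NOT a fitted SVR; for illustration only."""
    return 3000.0 * slp + 20.0 * (temperature - 27.0)


days = 7  # the text covers horizons of 7 to 30 days
slp_by_day = [[0.5 + 0.02 * h for h in range(24)] for _ in range(days)]
temp_by_day = [[27.0] * 24 for _ in range(days)]

rows = build_forecast_inputs(slp_by_day, temp_by_day)
predictions = forecast(stand_in_model, rows)
print(len(predictions))  # 168
```

None of these inputs depends on recently observed load, so the horizon is limited only by how far ahead the SLP and the temperature forecast are available.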

**3. Results and Discussion**

**3.1 Input data**

The article uses EVNHCMC data from January 1, 2015 to November 17, 2018 to run the test models. After preprocessing, the dataset is divided into two parts, a training set and a test set, where the test set is the last 30 days of the dataset. Alternatively, the dataset is divided into phases to test the forecast results over different time periods.
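A hold-out split of this kind can be sketched as follows, assuming one record per hourly cycle stored as a (date, hour, load) tuple. The record layout, the function name and the zero placeholder loads are assumptions; only the date range is taken from the text.

```python
from datetime import date, timedelta


def split_last_days(records, test_days=30):
    """Hold out the final test_days days of time-ordered (date, hour, load) records."""
    last_day = records[-1][0]
    first_test_day = last_day - timedelta(days=test_days - 1)
    train = [r for r in records if r[0] < first_test_day]
    test = [r for r in records if r[0] >= first_test_day]
    return train, test


# Placeholder records covering the study period, with load set to 0.0.
start, end = date(2015, 1, 1), date(2018, 11, 17)
n_days = (end - start).days + 1
records = [
    (start + timedelta(days=d), hour, 0.0)
    for d in range(n_days)
    for hour in range(24)
]

train, test = split_last_days(records)
print(len(train), len(test), test[0][0])
```

Splitting by date rather than by a random sample keeps the test set strictly later than the training set, which matches how the forecast will be used.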

Input data for training algorithms include: capacity (Pmax/Pmin) in 60 minute cycles; temperature (max / min) in 60-minute cycles; standardized load profiles of 24 hours of day; list of holidays and Lunar New Year in the forecast year. A useful measurement parameter is the mean absolute percentage error (MAPE) which is used to evaluate the error of models.

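$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{t=1}^{N}\left|\frac{A_t - F_t}{A_t}\right| \tag{6}$$

where $A_t$ is the actual load, $F_t$ is the forecast load and $N$ is the number of cycles evaluated.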
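For reference, MAPE in its usual form, the mean of |actual − forecast| / |actual| expressed in percent, can be computed with a few lines of code. The function name and the sample values are illustrative.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical actual and forecast loads with errors of 10%, 5% and 0%.
score = mape([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(round(score, 2))
```

Because each error is divided by the actual value, low-load cycles such as holiday nights weigh more heavily for the same absolute deviation.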

The algorithms are programmed in MATLAB and the results are exported to Excel files for data exploitation.

**3.2 SVR Models**

Processed historical data (power consumption, capacity and temperature recorded in 24 cycles of 60 minutes each), together with the Standardized Load Profiles (SLP), are fed into modules that build SVR models, parameterized by the regularization coefficient C, the tube width ε and the kernel function. Four typical sets of SVR model parameters are proposed:

**3.3 Results and analysis**

The forecasts are run for February 2018 (the month of the Lunar New Year) to assess the error of the models.

**3.3.1 The model with inputs included: data of the previous day, the previous hour, the previous week and the average of the previous week.**

Processed historical data (power consumption, capacity and temperature recorded in 24 cycles of 60 minutes each), together with the Standardized Load Profiles (SLP), are fed into modules that build regression functions using the SVR, Neural Network and Random Forest algorithms.

We choose the regression function with the smallest error, which will be used as the regression function for the next forecast phase. The model Yts4 is selected as the forecasting model.

– Forecast results for February 2018. The forecast results of this model for February show a large difference between actual and forecast values. The reason is that historical data from January 2018 (7, 14 and 30 days before the forecast date) was used as the input to the trained model.

**3.3.2 SLP – SVR combination model**

– Results of testing SVR models. Tables 3 and 4 show the test run results of the regression models under development. The MAPE results are evaluated in order to select a standard model as the official forecasting model for the later stage. Models Yts2 (2.41%) and YtRF (2.60%) both have quite low errors. However, when the error is considered component by component, model Yts2 performs better, with a lower error for each component than YtRF. The authors therefore choose model Yts2.

We choose the regression function with the smallest error which will be used as regression function for the next forecast phase. The model Yts2 is selected to be a forecasting model.

**4. Conclusion**

Comparing the experimental results for the different input datasets (load data of the previous day, the previous week and the previous month versus the Standardized Load Profile, SLP), we see that the results of the SLP-SVR model are close to the actual values for February 2018, while the results of the old model deviate considerably.

Thus, the experiments show that using the Standardized Load Profile (SLP) as the input dataset for the forecasting regression modules is effective and gives forecasts with low errors. It solves the problem of deviation between solar and lunar dates, especially in the months of the Lunar New Year.

**References**

[1] Basak, D., Pal, S., Patranabis, D.C. 2007. Support Vector Regression, Neural Information Processing – Letters and Reviews, Vol. 11, No. 10, pp. 203 – 224.

[2] Ceperic, E., Ceperic, V., and Baric, A. 2013. A strategy for short-term load forecasting by support vector regression machines, IEEE Transactions on Power Systems, vol. 28, pp. 4356–4364, Nov. 2013.

[3] Cherkassky, V., Ma, Y. 2002. Selection of Meta-parameters for Support Vector Regression, International Conference on Artificial Neural Networks, Madrid, Spain, Aug. pp. 687 – 693.

[4] Electricity Load Forecasting for the Australian Market Case Study version 1.3.0.1 by David Willingham.

[5] Gunn, S.R. 1998. Support Vector Machines for Classification and Regression, Technical Report, Image Speech and Intelligent Systems Research Group, University of Southampton.

[6] Juan H., Tingting S. and Jing C. 2016. Comparison of Random Forest and SVM for Electrical Short-term Load Forecast with Different Data Sources. 978-1-4673-9904-3/16/$31.00 ©IEEE.

[7] Lemuel Clark P. V., Christelle R. V. and Jerald Aldin A. D. 2015. Next Day Electric Load Forecasting Using Artificial Neural Networks. 8th IEEE International Conference Humanoid, Nanotechnology, Information Technology Communication and Control, Environment and Management (HNICEM). The Institute of Electrical and Electronics Engineers Inc. (IEEE) – Philippine Section 9-12 December 2015 Water Front Hotel, Cebu, Philippines.

[8] MHMR Shyamali Dilhani and Chawalit Jeenanunt. 2016. Daily electric load forecasting: Case of Thailand. 7th International Conference on Information Communication Technology for Embedded Systems 2016 (IC-ICTES 2016). 978-1-5090-2248-9/16/$31.00 ©IEEE.

[9] Nguyen T.D., Tran T.H., Nguyen T.P. 2018. Page(s):90 – 95. 10.1109/GTSD.2018.8595514. 4th International Conference on Green Technology and Sustainable Development (GTSD).

[10] Smola, A.J., Schölkopf, B. 2004. A Tutorial on Support Vector Regression, Statistics and Computing, Vol. 14, No. 3, pp. 199 – 222.

[11] Understanding Support Vector Machine Regression and Support Vector Machine Regression.

[12] Vapnik, V. 1995. “The nature of statistical learning theory.” Springer, NY.

Author: **Tuan-Dung Nguyen**: Ho Chi Minh Power Company (EVNHCMC), Ho Chi Minh City, Vietnam

Author: **Thanh-Phuong Nguyen**: HUTECH Institute of Engineering, Ho Chi Minh City University of Technology (HUTECH), Ho Chi Minh City, Vietnam

**Keywords**: load forecasting; regression model; lunar new year; standardized load profile; support vector regression

**Load Forecasting for Months of the Lunar New Year Holiday Using Standardized Load Profile and Support Vector Regression: Case Study Ho Chi Minh City**

Link https://tampacific.com/jtin/load-forecasting-for-months-of-the-lunar-new-year-holiday-using-standardized-load-profile-and-support-vector-regression.html

* This work is licensed under a **CC-BY 4.0 International License**.