How to Accurately Identify Commercial and Industrial Energy Storage Users in Peak-Valley Arbitrage Models?
In recent years, the number of user-side energy storage systems has grown significantly, driven by supportive development policies. In this context, how to thoroughly identify existing energy storage users and manage them in a refined way has become an important research topic for load regulation in new power systems.
This article aims to identify commercial and industrial user-side energy storage systems within peak-valley arbitrage models. It combines energy storage operation strategies with load curve variations to establish a typical feature index system. Through MiniBatch K-Means clustering, random forest feature selection, and multi-dimensional cross-iteration models, we can accurately identify existing energy storage users. This contributes to the efforts of grid enterprises in investigating and managing user-side energy storage resources.
Challenges in Identifying User-Side Energy Storage
In April 2024, the National Energy Administration issued a notice to promote the grid connection and scheduling operation of new energy storage systems, aiming to optimize their management and enhance their effectiveness in supporting the construction of new power systems. User-side energy storage possesses significant cost-reducing and efficiency-enhancing capabilities. By shifting electricity consumption from peak to valley periods, it can effectively reduce high electricity costs, integrate with solar power to improve daytime renewable energy absorption, and lower carbon emissions. This not only promotes environmental protection and sustainable development but also balances grid loads, reduces the costs of expanding power supply capacity, and presents practical, valuable, and scalable opportunities for users and the grid.
However, early investments in user-side energy storage were largely made by enterprises on their own initiative, without mandatory grid management requirements. As a result, power companies hold little information about users whose storage is already in operation, which makes it difficult to mobilize storage resources for grid interaction. Users, in turn, struggle to adjust their storage operation strategies in response to grid supply gaps and subsidy policies, which limits the economic potential of energy storage.
To reduce the difficulty of information verification, it is necessary to analyze how users' electricity consumption behavior changes before and after energy storage is installed. This analysis supports an identification model for existing storage users, helping frontline staff carry out user investigations efficiently and supporting the grid's refined management of load resources.
Current Issues in Energy Storage User Identification
The identification of energy storage users currently faces several challenges:
- Sparse Research Samples: Early management approaches led to limited control over energy storage users by power companies. On-site verification requires user cooperation and is time-consuming, resulting in insufficient sample data for analysis. This scarcity hinders the identification of patterns, increases the difficulty of algorithm learning, and affects the accuracy and generalization capability of identifications.
- Complexity of Electricity Consumption Characteristics: Large users experience significant fluctuations in their electricity loads due to operational influences. Additionally, various factors complicate the analysis of electricity consumption characteristics. For instance, when energy storage capacity is small, the load regulation effect of storage may be overshadowed by the user’s inherent consumption adjustments. Many users already implement peak-shifting strategies to reduce electricity costs, making it difficult to distinguish between their load regulation and that of energy storage. A high proportion of large users have installed solar power systems, and fluctuations in generation can significantly impact daytime loads, further complicating the analysis.
- High Flexibility of Energy Storage Charging and Discharging: While energy storage devices offer controllable input and output power, they do not always operate at a constant power rate, exhibiting various charging and discharging profiles. Currently, power companies lack fine-grained monitoring of internal electricity use by enterprises, thus precluding the direct application of typical load separation for analysis. A comprehensive consideration of business and data performance is necessary to formulate solutions.
Proposed Solutions for User-Side Energy Storage Identification
To address the issues of data scarcity and complex characteristics in the identification of user-side energy storage, this article proposes a user identification model based on typical sample selection methods enhanced by data augmentation, specifically for commercial and industrial users in peak-valley arbitrage models. The solutions encompass the following four aspects:
- Data Augmentation and Typical Sample Selection: To tackle issues of missing device-level load data, we adopt a strategy combining typical sample selection and gradual iterative optimization to continually expand the energy storage sample database. This approach dynamically adjusts feature rules and thresholds, gradually improving the accuracy of the energy storage load identification model.
- User Clustering Using MiniBatch K-Means: We conduct detailed clustering analysis on the load curves of sample users, identifying typical user groups to guide the construction of an energy storage identification index system and enhance the accuracy and specificity of the analysis.
- Feature Selection and Optimization via Random Forest: Under a unified data source, we utilize random forest models to select and optimize features, extracting key indicators and setting threshold values to establish a comprehensive indicator system that improves model adaptability and generalization capability.
- Multi-Dimensional Feature Cross-Iteration Model Optimization: We construct a multi-dimensional business feature cross-iteration model that continuously optimizes feature combinations and thresholds based on identification results, iteratively enhancing identification precision and applicability for algorithm promotion.
Overall Framework
The overall framework is illustrated in the following figure:
Data Processing Strategy
Due to missing device-level load data, it is challenging to identify energy storage loads through multi-dimensional load decomposition analysis. This study focuses on users with electricity capacities exceeding 630 kW and employs a stepwise optimization process for iterative improvements in the identification model:
- Extraction of Typical Load Users: We extract monthly load curves for users, eliminating data from dates with low load completeness. Missing values are managed using Lagrange interpolation to create high-quality monthly average load curves, excluding holidays and other special dates to enhance analytical accuracy.
- Baseline Month Selection: Using the characteristic of increased electricity consumption during valley periods, we employ a sliding time window approach to identify months with significant load increases as baseline months, ensuring the stability and representativeness of energy storage features during the analysis period.
- Feature Construction: The feature construction focuses on low electricity consumption periods in the early morning, midday, and late evening, considering the impact of solar power generation on load. This generates additional feature indicators and rules to highlight the characteristics of energy storage users.
- Iterative Optimization: Progressing gradually at the municipal level, we verify each batch of identified energy storage users, continually expanding the sample database and optimizing features and rules to enhance model identification accuracy and facilitate ongoing improvements in data processing.
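The gap-filling and baseline-month steps above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names, the two-point neighbourhood used for the Lagrange polynomial, the three-month window, and the 1.2 growth ratio are all assumptions chosen for the example.

```python
from statistics import mean


def lagrange_fill(curve, window=2):
    """Fill None gaps with a Lagrange polynomial through nearby known points.

    `window` (an assumed parameter) is the number of known points taken
    on each side of the gap.
    """
    known = [(i, v) for i, v in enumerate(curve) if v is not None]
    filled = list(curve)
    for i, v in enumerate(curve):
        if v is not None:
            continue
        left = [p for p in known if p[0] < i][-window:]
        right = [p for p in known if p[0] > i][:window]
        pts = left + right
        total = 0.0
        for j, (xj, yj) in enumerate(pts):
            # Lagrange basis weight of point j evaluated at position i.
            weight = 1.0
            for m, (xm, _) in enumerate(pts):
                if m != j:
                    weight *= (i - xm) / (xj - xm)
            total += yj * weight
        filled[i] = total
    return filled


def find_baseline_month(valley_means, window=3, min_ratio=1.2):
    """Return the index of the first month whose valley-period load is at
    least `min_ratio` times the mean of the preceding `window` months,
    or None if no month qualifies. Both parameters are assumed values."""
    for k in range(window, len(valley_means)):
        prior = mean(valley_means[k - window:k])
        if prior > 0 and valley_means[k] / prior >= min_ratio:
            return k
    return None


# Hypothetical data: a short load segment with one missing reading, and
# six monthly valley-period averages with a jump in the fifth month.
segment = lagrange_fill([0.0, 1.0, None, 3.0, 4.0])
baseline = find_baseline_month([100, 102, 98, 101, 150, 152])
```

In practice the ratio and window would be tuned during the iterative verification described above rather than fixed in advance.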
Clustering Analysis
In a preliminary analysis of the load curves of energy storage users, we observed significant differences in electricity consumption characteristics, alongside limited data volume and indistinct features. To address this, we adopt a two-step strategy. First, we use the MiniBatch K-Means clustering algorithm to segment users based on their 96-point load data and identify the user groups with the most pronounced energy storage characteristics. Second, we systematically compare the data distributions of storage and non-storage users across the indicators to reveal the distinct consumption patterns of each user type.
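As an illustration of the first step, the sketch below clusters 96-point daily curves with scikit-learn's `MiniBatchKMeans`. The data is synthetic, and the valley (00:00 to 06:00) and peak (18:00 to 21:00) windows, the cluster count, and the per-curve max normalisation are assumptions made for the example, not values taken from the study.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

VALLEY = slice(0, 24)   # assumed valley window: 00:00-06:00 at 15-minute steps
PEAK = slice(72, 84)    # assumed peak window: 18:00-21:00


def cluster_load_shapes(curves, n_clusters=2, seed=0):
    """Cluster daily load curves by shape and return (labels, model)."""
    # Scale each curve by its own maximum so clusters reflect shape, not size.
    shapes = curves / curves.max(axis=1, keepdims=True)
    model = MiniBatchKMeans(
        n_clusters=n_clusters, batch_size=64, n_init=3, random_state=seed
    )
    return model.fit_predict(shapes), model


# Synthetic stand-in data: 140 flat profiles and 60 profiles that charge
# in the valley window and discharge in the peak window.
rng = np.random.default_rng(0)
flat = 100 + rng.normal(0, 2, size=(140, 96))
storage_like = 100 + rng.normal(0, 2, size=(60, 96))
storage_like[:, VALLEY] += 40
storage_like[:, PEAK] -= 40
curves = np.vstack([flat, storage_like])

labels, model = cluster_load_shapes(curves)

# Valley-to-peak ratio of each cluster centre: a high ratio marks the
# group with the most pronounced storage-like behaviour.
ratios = [c[VALLEY].mean() / c[PEAK].mean() for c in model.cluster_centers_]
storage_cluster = int(np.argmax(ratios))
```

Comparing the cluster centres in this way is one simple means of picking out the group to examine further in the second step.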
Feature Selection Using Random Forest Models
The feature indicators for energy storage users span multiple dimensions, such as user profiles, power fluctuations, and load variances. Because the indicators are numerous and their thresholds are hard to set by hand, feature selection and optimization are crucial. After the initial clustering of typical energy storage users, we analyze the underlying 96-point load data and its attributes to define the consumption indicators precisely. We then train classic machine learning classifiers, such as random forests and decision trees, to quantify how strongly and in what way each indicator influences the classification, retain the key indicators, and set preliminary thresholds.
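One way this could look in practice is sketched below: a random forest ranks indicators by importance, and a depth-one decision tree on the top indicator supplies a preliminary threshold. The indicator names (`valley_peak_ratio`, `load_variance`) and the synthetic labels are hypothetical, and using a single-split tree for the threshold is an assumption of this example rather than the study's documented procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
n = 300
is_storage = rng.integers(0, 2, size=n)

# Hypothetical indicators: one informative, one deliberately uninformative.
valley_peak_ratio = np.where(
    is_storage == 1, rng.normal(1.8, 0.1, n), rng.normal(0.9, 0.1, n)
)
load_variance = rng.normal(1.0, 0.3, n)
X = np.column_stack([valley_peak_ratio, load_variance])
names = ["valley_peak_ratio", "load_variance"]

# Rank indicators by impurity-based importance.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, is_storage)
ranking = sorted(
    zip(names, forest.feature_importances_), key=lambda item: item[1], reverse=True
)

# Fit a single-split tree on the top indicator to read off a threshold.
top_index = names.index(ranking[0][0])
stump = DecisionTreeClassifier(max_depth=1, random_state=0)
stump.fit(X[:, [top_index]], is_storage)
threshold = float(stump.tree_.threshold[0])

print(ranking[0][0], round(threshold, 3))
```

Impurity-based importances can favour high-cardinality features, so the ranking is best treated as a starting point for the threshold tuning described above.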
Cross-Iteration Model Based on Multi-Dimensional Business Features
While the random forest model yields quality features and thresholds, the features are interrelated and complementary, so they must be combined effectively to improve identification accuracy. This study introduces a cross-iteration algorithm that generates new composite features through linear combinations of existing features and then cross-validates them within the multi-indicator system. By iterating as the base of verified user data grows, we distill the optimal set of rules and uncover latent patterns within the data, ultimately producing a high-accuracy cross-iteration rule table.
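A rough sketch of the composite-feature idea follows. It scores single features and pairwise linear combinations against labelled samples and sorts the resulting rules by accuracy. The feature names, the weight grid, and the "value >= threshold means storage" rule form are assumptions for illustration, and the cross-validation step is omitted for brevity.

```python
from itertools import combinations


def best_threshold(values, labels):
    """Return (threshold, accuracy) for the rule `value >= threshold`."""
    best = (None, -1.0)
    for t in sorted(set(values)):
        correct = sum((v >= t) == bool(y) for v, y in zip(values, labels))
        acc = correct / len(labels)
        if acc > best[1]:
            best = (t, acc)
    return best


def cross_combine(features, labels, weights=(-1.0, -0.5, 0.5, 1.0)):
    """Build a rule table of (accuracy, rule name, threshold), best first."""
    rules = []
    for name, vals in features.items():
        t, acc = best_threshold(vals, labels)
        rules.append((acc, name, t))
    # Composite features: a + w * b for every feature pair and weight.
    for (na, a), (nb, b) in combinations(features.items(), 2):
        for w in weights:
            combo = [x + w * y for x, y in zip(a, b)]
            t, acc = best_threshold(combo, labels)
            rules.append((acc, f"{na} {w:+g}*{nb}", t))
    rules.sort(key=lambda r: r[0], reverse=True)
    return rules


# Hypothetical verified samples: neither indicator separates the classes
# alone, but their difference does.
features = {
    "night_valley_gain": [5, 9, 3, 8, 6, 7],
    "daytime_gain": [4, 3, 2, 7, 6, 1],
}
labels = [0, 1, 0, 0, 0, 1]
rule_table = cross_combine(features, labels)
```

Each time a newly verified batch of users is added to the sample database, the table can be rebuilt, which is one plausible reading of the iterative optimization described above.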
Conclusion
This article focuses on the precise identification of user-side energy storage systems. By employing clustering algorithms and machine learning classification techniques, we achieve accurate recognition of energy storage users outside the current regulatory framework. Through in-depth analysis of user electricity load curves, we establish an identification model that effectively enhances recognition accuracy, enabling power companies to comprehensively understand the distribution of user-side energy storage. Future work will center on developing models for potential energy storage investment users and revenue estimation. By analyzing existing user characteristics and profit models, we aim to identify high-revenue potential users and provide configuration recommendations to expand the energy storage market.