Since 1997, KDD Cup has been the premier annual Data Mining competition held in conjunction with the ACM SIGKDD Conference on Knowledge Discovery and Data Mining. This year’s KDD Cup challenge task presents interesting technical challenges and has practical importance for the utilization of wind energy. Here we propose a spatial dynamic wind power forecasting challenge to facilitate the progress of data-driven machine learning methods for wind power forecasting.
Wind Power Forecasting (WPF) aims to accurately estimate the wind power supply of a wind farm at different time scales. Wind power is a clean and safe source of renewable energy, but it cannot be produced consistently, which leads to high variability. Such variability can present substantial challenges to incorporating wind power into a grid system. To maintain the balance between electricity generation and consumption, fluctuations in wind power must be offset by power from other sources, which might not be available at short notice (for example, it usually takes at least 6 hours to fire up a coal plant). Thus, WPF has been widely recognized as one of the most critical issues in wind power integration and operation. There has been an explosion of studies on wind power forecasting in the data mining and machine learning community. Nevertheless, handling the WPF problem well remains challenging, since high prediction accuracy is always demanded to ensure grid stability and security of supply.
We present a unique Spatial Dynamic Wind Power Forecasting dataset from Longyuan Power Group Corp. Ltd: SDWPF, which includes the spatial distribution of wind turbines as well as dynamic context factors such as temporal, weather, and turbine internal status. In contrast, most existing datasets and competitions treat WPF as a time series prediction problem without knowing the locations and context information of wind turbines.
An illustration of the SDWPF dataset is shown below. Each wind turbine i generates its own wind power Patv_i, and the output power of the wind farm is the sum over all the wind turbines. In other words, at time t, the output power of the wind farm is P = ∑_i Patv_i.
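The farm-level sum above can be sketched in a few lines of Python. This is a minimal illustration, not the competition's baseline code: the record layout `(timestamp, turbine_id, patv)` and the sample values are assumptions made up for the example.

```python
from collections import defaultdict


def farm_power(records):
    """Sum per-turbine power Patv_i into farm-level power P for each timestamp.

    `records` is an iterable of (timestamp, turbine_id, patv) tuples.
    This layout is a hypothetical one chosen for illustration.
    """
    totals = defaultdict(float)
    for timestamp, _turbine_id, patv in records:
        totals[timestamp] += patv
    return dict(totals)


# Hypothetical readings for two turbines at two time steps.
sample = [
    ("t0", 1, 100.0),
    ("t0", 2, 250.5),
    ("t1", 1, 90.0),
    ("t1", 2, 0.0),
]
farm = farm_power(sample)
```

Here `farm` maps each timestamp to the total output of the farm at that time, which is the quantity P in the formula above.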
Two features distinguish this competition task from previous WPF competition settings:
- Spatial distribution: this competition will provide the relative location of all wind turbines in a wind farm for modeling the spatial correlation among wind turbines.
- Dynamic context: important weather situations and turbine internal contexts monitored by each wind turbine are provided to facilitate the forecasting task.
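One possible way to use the relative locations is to compute pairwise distances between turbines and look up each turbine's nearest neighbors, whose readings could then serve as extra model inputs. The sketch below assumes locations are given as planar `(x, y)` coordinates keyed by turbine ID; that format, the function names, and the sample coordinates are all hypothetical.

```python
import math


def pairwise_distances(locations):
    """Return Euclidean distances for every unordered pair of turbines.

    `locations` maps turbine_id -> (x, y) relative coordinates
    (an assumed format, used only for illustration).
    """
    ids = sorted(locations)
    return {
        (a, b): math.hypot(
            locations[a][0] - locations[b][0],
            locations[a][1] - locations[b][1],
        )
        for a in ids
        for b in ids
        if a < b
    }


def nearest_neighbors(locations, turbine_id, k):
    """Return the IDs of the k turbines closest to `turbine_id`."""
    x0, y0 = locations[turbine_id]
    others = [
        (math.hypot(x - x0, y - y0), other_id)
        for other_id, (x, y) in locations.items()
        if other_id != turbine_id
    ]
    # Sort by distance, breaking ties by turbine ID for determinism.
    return [other_id for _dist, other_id in sorted(others)[:k]]


# Hypothetical layout of three turbines on a line.
layout = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (6.0, 8.0)}
distances = pairwise_distances(layout)
closest_to_first = nearest_neighbors(layout, 1, k=1)
```

Distance is only one proxy for spatial correlation; wind direction and terrain may matter as much, so this should be read as a starting point rather than a recommended feature.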
All the deadlines are at 23:59 AOE.
- March 16, Registration site open.
- March 20, Initial data released. Participants will practice with the initial WPF data to get familiar with the problem.
- April 10, Full data released. We will release all the datasets and baseline code.
- May 10, Submission start. All teams can try the demonstration submission to ensure a smooth final test submission.
- June 20, Test data update. A new test set will be released for the test prediction.
- **June 21, Team Freeze Deadline. All team members should be confirmed.**
- July 15, Final submission deadline. Each team submits its final prediction model. The models will be evaluated on a private test set to determine the candidate awardee teams.
- July 18, Winner notification. Private notifications and instructions about the code and technical paper are sent to the awardees.
- July 21, Code and the technical paper submission deadline for the awardees.
- July 22, Winners Announcement.
- August 1, Technical paper revision deadline.
- August 15, KDD Cup Workshop.