Public defence Norhakim Yusof

Exploring wind dynamics in space and time - mining frequent patterns and modeling power potential 

Norhakim Yusof is a PhD student in the Department of Geo-Information Processing.

Wind is a dynamic geographic phenomenon that varies strongly over space and time. Wind data is becoming increasingly available, typically as time series of wind speed and direction measured at fixed locations and at one or more heights. Despite this increasing availability, designing and applying effective information extraction methods for wind data remains challenging. Modern data mining approaches are promising because they can identify wind patterns. Studying and mapping these patterns holds great potential for decision-makers in many application domains, such as finding appropriate locations for wind farms. This PhD research focuses on the analysis of wind data to obtain a better understanding of wind behaviour and of its potential to generate power.

Despite previous wind data mining and wind power modelling studies, several challenges persist in the exploration of this complex spatio-temporal data: (1) mining and exploring spatio-temporal patterns for wind speed and direction simultaneously; (2) mining wind patterns without discretizing time; (3) exploring wind speed and direction patterns across multiple heights; and (4) developing data-driven models to predict wind power potential at a national scale. This PhD thesis addresses these four challenges.

First, we developed a data mining-based approach to explore spatio-temporal wind patterns. Sequential pattern mining (SPM) was used to mine recurrent wind speed and direction patterns at the same time. These patterns characterize wind behaviour over large areas and long periods of time. Yet, applying SPM on its own yields information that is hard to understand and interpret. To overcome this, we used various geovisualization techniques. In a case study, we coupled LCMSeq, a specific SPM algorithm, with TileVis and a 3D wind rose. Using 10 years of Dutch hourly wind data, this visual mining approach was used to discover daily wind patterns at weather stations. The resulting patterns can be grouped into five unique wind sequential patterns that describe the space-time varying behaviour of wind dynamics at a daily time scale. In particular, Dutch wind behaviour is mostly characterized by wind that blows from the south-southwest with speeds ranging from 5 to 8 m/s. Our approach demonstrates that the combination of an algorithmic strategy and a visual strategy can be used to explore and mine frequent wind patterns in an interactive fashion.

Then, to perform a more exhaustive exploration of wind patterns, we introduced a time sliding window (SW). The SW is a temporal kernel that allows wind patterns to be analysed from continuous time series instead of splitting the data into arbitrary temporal units (e.g. days). This SW-based data mining approach showed that the Dutch weather station data contained three times more patterns than were found in the previous study. The results showed that Dutch wind behaviour is characterized by two prevailing wind directions: one that blows from the south-southwest (as found in the previous study) and another that blows from the northeast-east. These findings confirm that there is a risk of missing patterns if one neglects the effect of discretizing time when mining temporal data. Hence, using continuous time approaches is more efficient and eliminates the need to discretize the data into arbitrary temporal units. Furthermore, this time we used multiple linked views that combine TileVis, a 3D wind rose in the space-time cube and a 2D map to interactively visualize and explore the mined patterns. This interactivity offers an improved visual representation of the results and enables users to explore the underlying multi-dimensional wind patterns in space and time. Moreover, we used the concept of ‘Focus + Context’ to control the level of detail in each of the views so that we could prevent visual clutter while still providing an overview of the wind patterns.
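The difference between fixed temporal units and a sliding window can be sketched as follows. This is only an illustration under assumed settings: the window width of 24 samples, the step of 1 sample and the function names are hypothetical and are not taken from the thesis.

```python
def daily_blocks(series, block=24):
    """Split a series into non-overlapping blocks (fixed temporal units)."""
    return [series[i:i + block]
            for i in range(0, len(series) - block + 1, block)]


def sliding_windows(series, width=24, step=1):
    """Move a window of `width` samples along the series in `step` increments."""
    return [series[i:i + width]
            for i in range(0, len(series) - width + 1, step)]
```

With three days of hourly data (72 samples), the fixed split yields 3 sequences, whereas the sliding window yields 49. The additional sequences include windows that straddle midnight, which is where patterns can be lost when the data is cut into days.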

After that, we developed a data mining approach to enhance the exploration of complex wind behaviour. In particular, this time we considered the dynamics of wind over space and time, where space includes height as a third dimension. In this way, the mined patterns become wind profile patterns. The standard SPM approach was modified to mine patterns from a set of multi-dimensional sequences consisting of wind attributes (speed/direction) along with their locations, times of measurement and heights. This resulted in a multi-dimensional sequential pattern mining (MDSPM) approach to map frequent wind profile patterns. The approach was applied to 24 years of European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim reanalysis gridded data (1990-2013 at 0.125° resolution for 6 different heights) for the Netherlands. The mined wind profile patterns were visualized using a 3D wind rose, a circular histogram and a 2D map. These three visualizations support a holistic understanding of wind behaviour in the study area. The results identified four frequent wind profile patterns, which were further analysed to determine their wind shear coefficients and turbulence intensities as well as their spatial overlap with areas that currently contain wind turbines. The results showed that one of the wind profile patterns is highly suitable for harvesting wind energy at a height of 128 m and that approximately 69% of the geographical area covered by this pattern already contains wind turbines. These results show that the proposed mining approach enables an enhanced exploration of wind behaviour.
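For readers unfamiliar with the two quantities mentioned above, the sketch below uses their common textbook definitions. It is an assumption that the thesis uses the same ones. The wind shear coefficient is taken as the exponent alpha of the power law v2 / v1 = (z2 / z1) ** alpha, and the turbulence intensity as the standard deviation of wind speed divided by its mean over an averaging period. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def shear_coefficient(v_low, z_low, v_high, z_high):
    """Power-law exponent alpha from wind speeds at two heights."""
    return math.log(v_high / v_low) / math.log(z_high / z_low)


def extrapolate_speed(v_ref, z_ref, z_target, alpha):
    """Estimate wind speed at z_target from a reference height using alpha."""
    return v_ref * (z_target / z_ref) ** alpha


def turbulence_intensity(speeds):
    """Standard deviation of wind speed divided by its mean."""
    return pstdev(speeds) / mean(speeds)
```

A larger alpha means that wind speed increases more strongly with height, while a lower turbulence intensity indicates steadier wind.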

Finally, we proposed a novel data-driven modelling approach to map annual wind power potential. Our approach is based on state-of-the-art machine learning methods, which were used to create two models, namely a Full Profile model and a Selective Height model. The proposed models were designed by integrating environmental variables (24 years of wind speed and direction at 2.5 km resolution and land cover information at 25 m resolution) and wind turbine design features (swept area) with national-scale wind turbine production data. The Full Profile model considers the variability of wind speed and direction across multiple heights (6 heights), whereas the Selective Height model focuses on the wind speed and direction at the height of the wind turbines. A key characteristic of these models is that they predict wind power normalized by the swept area, which allows the models to be trained by combining data from multiple wind turbine types. This characteristic also enables the models to predict and map wind power potential for small, medium and large wind turbines. Regarding machine learning, we tested two ensemble regression methods: Gradient Boosting Regression (GBR) and Random Forest Regression (RFR). Both methods provide the importance of the environmental variables used to model wind power. Results showed that the Full Profile model with GBR performs best, with a normalized RMSE of 0.15. This model also shows a strong correlation between the measured and the predicted wind power for all wind turbine (WT) categories. The variables deemed important by the Full Profile GBR model show that wind power is mostly influenced by the wind speed and direction at 40, 60 and 200 m, especially in the months of January, March, July, November and December. Furthermore, we noted that existing wind turbines are mostly located in agricultural areas. This explains the relatively high importance of the agricultural-based feature when predicting wind power potential. Besides this, the selection of the Full Profile model reveals that wind power is sensitive to the variability of wind behaviour at multiple heights.
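The role of the swept area as a normalizing factor, and the error measure reported above, can be illustrated with a small sketch. Two assumptions are made here that the summary does not confirm: the swept area is computed from the rotor diameter as the area of a circle, and the RMSE is normalized by the range of the measured values (other conventions divide by the mean). The function names and the choice of kW as unit are hypothetical.

```python
import math


def swept_area(rotor_diameter_m):
    """Area swept by the rotor blades, in square metres."""
    return math.pi * (rotor_diameter_m / 2) ** 2


def normalized_power(power_kw, rotor_diameter_m):
    """Power per unit swept area (kW per square metre)."""
    return power_kw / swept_area(rotor_diameter_m)


def nrmse(measured, predicted):
    """RMSE divided by the range of the measured values (assumed convention)."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    rmse = math.sqrt(sum(squared) / len(squared))
    return rmse / (max(measured) - min(measured))
```

Dividing by the swept area places small and large turbines on a comparable scale, which is what makes it possible to train a single model on production data from several turbine types.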

In a nutshell, this thesis presents the combination of data mining and geovisualization for the exploration of multi-dimensional wind patterns so that we can get a better understanding of wind behaviour at a large scale. Depending on the patterns to be mined, our data mining approaches can be characterized as 2D (location and time) or 3D (location, time and height). The use of geovisualization enables a deeper exploration of complex wind patterns, and making the visualizations interactive improves the data mining approaches. This thesis has also demonstrated the potential of data-driven approaches for developing robust wind power prediction models. We hope that these findings contribute to a better understanding of wind (power) patterns and support future wind farm development. Since our approaches remain generic, we also hope that our findings will support the extraction of useful information from other dynamic spatio-temporal datasets.