(CoDAlab & IMTech). Received 21 April, 2022.
The world energy system is undoubtedly in transition. The widespread adoption of renewable energy is the fastest and cheapest route to greater energy independence after the invasion of Ukraine, and it is key to fighting climate change. The European Commission’s scenarios show that renewables-based electrification will be central to delivering climate neutrality in Europe by 2050. To achieve this goal, wind energy is a crucial component, as it is required to account for 50% of the European Union’s electricity mix, with renewables as a whole representing 81%. However, the crux of the matter in the advancement of the wind industry is the reduction of its levelized cost of energy (LCOE). The LCOE of a wind farm is determined by combining different factors, of which operation and maintenance costs are a significant part (20-25% in onshore wind farms and 25-30% in offshore ones). Therefore, a key factor in achieving low-cost wind energy is the optimization of maintenance strategies.
Energy production losses due to downtime (caused by unplanned repairs of the assets), together with the costs associated with the replacement of components, can scale up to millions of euros per year in any industrial-size wind farm. Thus, it is of paramount importance that the wind industry moves from corrective maintenance (repairing components after they break down) and preventive maintenance (scheduled at regular intervals without taking into consideration the actual condition of the asset) to so-called predictive maintenance (scheduled as needed based on the asset condition). Predictive maintenance is based on actual and timely information collected by monitoring the asset through a network of sensors (using high-frequency data of physical quantities such as vibration, temperature, oil analysis, and acoustic emissions). It provides operators with an advance warning before the fatal fault occurs, thus allowing them to plan ahead and schedule repairs to coincide with weather or production windows to reduce costs and turbine downtime. In other words, as shown in the figure below, predictive maintenance activities are only carried out when they are actually required: the item is not unnecessarily over-maintained, but it also does not fail unexpectedly, which leads to an optimized cost of maintenance actions. Digitalization and artificial intelligence are key technologies for this strategy, as they allow better exploitation of the information contained in the large amount of data acquired from the assets by different sensors (gathered continuously or periodically, online or offline). The general framework is to detect changes in the condition that represent deviations from normal operation and indicate a developing fault.

Comparison of corrective, preventive, and predictive maintenance over time with respect to the condition of the asset (left), and comparison of the number of failures and associated costs (right).
Within this framework, the CoDAlab research group has a research line devoted to the development of advanced predictive maintenance strategies based on artificial intelligence (AI) for the early prediction of failures in wind turbines. The methodologies are validated on real data from 150 wind turbines from different wind farms. This is a crucial resource and, in this regard, the support and data provided by the company Smartive are greatly appreciated [1].
Predictive maintenance is a wide-ranging area of research and has been successfully deployed in a wide variety of applications. However, its application to complex systems such as wind turbines, which are megastructures working under varying operating and environmental conditions and in harsh environments (such as offshore), remains an open challenge. Furthermore, the latest developments tend to use expensive, specifically tailored sensors, which are not economically viable for turbines already in operation, and even less so when they are close to reaching the end of their lifespan. This is relevant, as it is expected that 38 GW of wind farms in Europe will reach their life expectancy in the next five years. Based on current trends, it is estimated that about 2.4 GW will be decommissioned for repowering and 7 GW will be fully decommissioned. The remaining 29 GW will continue to operate and will be considered for lifetime extension services. In this context, data-driven predictive maintenance methodologies based on existing supervisory control and data acquisition (SCADA) data (available in all industrial-sized wind turbines) are a promising low-cost solution.
As SCADA data were initially designed for operation and control purposes only, their use for predictive maintenance is a great challenge. SCADA data contain about 200 different variables (that is, they are high dimensional) with a very low sampling rate (they are recorded continuously as 10-minute averages to reduce transmission bandwidth and storage). They depend on the operating region of the wind turbine (WT) as well as on the environmental conditions, and they are time series with a strong seasonality. Furthermore, when SCADA systems were built, the value of maintaining standardized maintenance work order logs with detailed fault descriptions was not known (as it was not envisioned that AI could help in this application). On top of this, most of the available data are from normal operation, which makes the data sets highly unbalanced. Despite these difficulties, the use of SCADA data for predictive maintenance has lately gained increased attention. However, many challenges are still to be addressed in current and future research. The next paragraph highlights the five main challenges in this research area.
First, a significant percentage of publications are based on supervised algorithms (classification methods). Despite their promising performance, their use in a real application is almost precluded, as they require historical labeled faulty and healthy data to be constructed. This is an important drawback, as obtaining labeled datasets from wind turbine operational data is typically hard (as maintenance records are not standardized), time-consuming, and prone to errors, and it leads to a highly unbalanced dataset. Additionally, the methodology cannot be applied in a straightforward way to wind farms where the fault of interest did not occur in the past. Second, a significant number of references use simulated SCADA data or experimental data (from a test bench) to validate the results. Although this is understandable, as real SCADA data sets are often proprietary and not easily available to the scientific community, it is an important drawback, as results obtained on synthetically generated data may not generalize well to real-world conditions. Third, the majority of the literature bases its results on a relatively small amount of data, usually only 1 to 4 wind turbines. Thus, it is not clear whether these strategies will generalize well to the whole wind farm. Fourth, some references contribute strategies that lead to a high number of false alarms, which makes them impractical in a real application, as they would end up creating alarm fatigue for operators. Fifth, a non-negligible number of papers detect the fault less than a week in advance, which is not helpful in a real application, where the plant operator needs at least months to plan ahead and schedule repairs to coincide with the availability of replacement parts as well as weather or production windows to reduce turbine downtime.
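The fourth and fifth challenges (false alarms and warning lead time) can be made concrete with a small scoring sketch. This is an illustration only, not an evaluation protocol taken from the cited literature: the function `evaluate_alarms`, the 180-day horizon, and all dates are hypothetical. It assumes that an alarm counts as a useful warning only if it falls within a chosen horizon before the recorded failure, and that every other alarm is a false alarm.

```python
from datetime import date

def evaluate_alarms(alarm_dates, failure_date, horizon_days):
    """Return (lead_time_in_days, false_alarm_count) for one turbine.

    Assumed convention: an alarm raised within `horizon_days` before the
    failure is a true warning; alarms raised earlier than that, or on or
    after the failure date, are counted as false alarms.
    """
    lead_times = []
    false_alarms = 0
    for alarm in alarm_dates:
        days_before = (failure_date - alarm).days
        if 0 < days_before <= horizon_days:
            lead_times.append(days_before)
        else:
            false_alarms += 1
    # The lead time is given by the earliest true warning, if there is one.
    lead_time = max(lead_times) if lead_times else None
    return lead_time, false_alarms

# Hypothetical alarm history and failure date, for illustration only.
alarms = [date(2021, 1, 10), date(2021, 6, 1), date(2021, 7, 15)]
lead_time, false_alarms = evaluate_alarms(
    alarms, date(2021, 9, 30), horizon_days=180
)
print(lead_time, false_alarms)  # 121 1
```

Under these assumptions, the January alarm is too far from the failure to count as a warning, while the June alarm gives roughly four months of lead time.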
On this basis, two contributions from the CoDAlab research group are worth highlighting. In [2], a main bearing prognosis strategy is proposed based on a normal behavior model (unsupervised) using an artificial neural network (ANN) with Bayesian regularization. This means that the ANN weights are considered as random variables and their density functions are updated according to Bayes’ rule. Furthermore, the Levenberg-Marquardt optimization method is adapted to provide a mathematical justification of the hyperparameter setup. Finally, the methodology is validated on a wind farm consisting of 12 wind turbines. In [3], an ensemble learning approach for predictive maintenance of the main bearing is proposed that combines an ANN (normality model designed at turbine level) with an isolation forest (anomaly detection model designed at wind farm level), and it is validated on a wind farm with 18 wind turbines. The normal behavior and anomalous samples of the wind turbines are identified, and several interpretable indicators are proposed based on the predictions of these algorithms. Finally, note that these two references contribute predictive maintenance strategies based solely on wind turbine SCADA data that address the aforementioned five main challenges: i) They are entirely unsupervised (not requiring the labeling of data through work order logs) and based only on healthy data, which extends their range of application to any wind farm (even when no faulty data have been recorded yet). ii) They are validated on real (not simulated or experimental) SCADA data. iii) They are validated at wind farm level. iv) They give reliable predictions with minimal false alarms thanks to specially designed fault prognosis indicators. v) They provide early warning months in advance.
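The normal behavior model idea can be illustrated with a minimal sketch. It is not the method of [2] or [3]: a one-variable least-squares regression stands in for the Bayesian-regularized ANN, the data are synthetic, and the one-day smoothing window, the persistence count, and the six-sigma threshold are all assumptions chosen for illustration. The model is fitted on healthy data only, and an alarm is raised when the smoothed residual (measured minus predicted bearing temperature) stays above the threshold.

```python
import random
import statistics

WINDOW = 144     # 144 ten-minute samples = one day (assumed smoothing window)
PERSISTENCE = 6  # consecutive smoothed samples above threshold (assumed)

def fit_linear(xs, ys):
    """Least-squares fit of y = intercept + slope * x (stand-in for the ANN)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residuals(model, xs, ys):
    """Measured value minus the value predicted by the normality model."""
    intercept, slope = model
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def moving_average(values, window):
    """Trailing moving average; the first output covers values[0:window]."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        if i >= window - 1:
            out.append(running / window)
    return out

def first_alarm(raw_residuals, threshold):
    """Raw sample index at which the alarm is raised, or None if never."""
    run = 0
    for j, value in enumerate(moving_average(raw_residuals, WINDOW)):
        run = run + 1 if value > threshold else 0
        if run >= PERSISTENCE:
            return j + WINDOW - 1  # index of the last sample in the window
    return None

def simulate(rng, n, drift_start=None, drift_per_step=0.0):
    """Synthetic (power in kW, bearing temperature in deg C) pairs."""
    power, temp = [], []
    for t in range(n):
        p = rng.uniform(0.0, 2000.0)
        drift = 0.0
        if drift_start is not None and t >= drift_start:
            drift = drift_per_step * (t - drift_start)  # slow overheating
        power.append(p)
        temp.append(30.0 + 0.01 * p + rng.gauss(0.0, 1.0) + drift)
    return power, temp

rng = random.Random(0)

# Train on healthy data only; no fault labels are needed.
train_power, train_temp = simulate(rng, 2000)
model = fit_linear(train_power, train_temp)
noise_std = statistics.stdev(residuals(model, train_power, train_temp))
threshold = 6.0 * noise_std / WINDOW ** 0.5  # six sigma of the smoothed noise

healthy_power, healthy_temp = simulate(rng, 2000)
healthy_alarm = first_alarm(residuals(model, healthy_power, healthy_temp), threshold)

drift_start = 500
faulty_power, faulty_temp = simulate(rng, 2000, drift_start, 0.01)
faulty_alarm = first_alarm(residuals(model, faulty_power, faulty_temp), threshold)

print("healthy turbine alarm index:", healthy_alarm)
print("faulty turbine alarm index:", faulty_alarm, "| drift starts at", drift_start)
```

Smoothing the residual and requiring persistence is one simple way to trade detection delay for fewer false alarms. The indicators designed in [2] and [3] are more elaborate than this sketch.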
Finally, work in the near future will be devoted to wind turbine digital twins (DT). A DT is an up-to-date representation, a model, of an actual physical asset in operation. It can be a model of a component (gearbox, generator), and it reflects the current asset condition and includes relevant historical data about the asset. DTs can be physics-based (from first principles involving mechanical, hydraulic, and electrical components), data-driven (for example, built with artificial intelligence or statistical approaches), or a combination of both. The models reflect the operating asset’s current environment, age, and configuration, which typically involves direct streaming of asset data into tuning algorithms that can use artificial intelligence methodologies. Once the digital twin is available and up to date, it can be used in different ways to predict future behavior. Notable examples are including sensors in the DT that are not present on the real asset, simulating future scenarios to inform current and future operations, and using the digital twin to extract the current operational state by feeding it current real inputs. Therefore, as previously indicated, immediate future work is related to the development of wind turbine DTs for prognostics and health management.
References
[1] Development of a cloud SCADA solution for the diagnosis and control of wind turbines, CDTI (Centre for the Development of Industrial Technology) university-company collaboration contract, Smartive–UPC, grant number CDTI-IDI 20191294, 2020-2022.
[2] Á. Encalada-Dávila, B. Puruncajas, C. Tutivén, Y. Vidal: Wind turbine main bearing fault prognosis based solely on SCADA data, Sensors 21 (2021) 2228.
[3] M. Beretta, Y. Vidal, J. Sepulveda, O. Porro, J. Cusidó: Improved ensemble learning for wind turbine main bearing fault diagnosis, Applied Sciences 11 (2021) 7523.