Risk-based Maintenance Strategies for Offshore Wind Energy Assets

This paper presents an integrated, risk-based approach for developing effective maintenance strategies for offshore wind energy assets. The method starts from a high-fidelity risk assessment of each category of critical components, leading to a shortlist of critical risks. These are then associated with the relevant failure mechanisms, which can be studied further to identify the features to be monitored through assigned inspection and monitoring practices. A maintenance approach (corrective, planned preventive or condition-based preventive) can then be assigned to each critical risk by following a structured decision tree. The approach can support the selection of a strategy that optimizes operational management and reduces operational expenditure (OPEX).


INTRODUCTION
Offshore wind energy technology has advanced considerably, with a number of assets currently deployed and the industry moving into new deployment locations, including China and the US. Analysis of the life cycle costs of existing installations has shown that operation and maintenance account for more than 30% of total costs, which creates a need to study maintenance-related decisions and to qualify advanced methods that can reduce these costs [1], [2]. Reliability-centered maintenance (RCM) has been successfully applied in multiple industries, avoiding underutilization of components and leading to increased availability and reduced maintenance costs. In offshore wind farms, distance from shore increases downtime because it limits the access of maintenance personnel to the asset; hence there is a need to investigate and further optimize maintenance strategies through appropriate methods such as RCM.
In this paper we develop and apply a systematic risk-based approach to defining a maintenance strategy for offshore wind energy assets. The framework includes the establishment of a risk policy, which prescribes the approach for criticality assessment, including the risk criteria considered (likelihood of occurrence, beta factor and consequence assessment). In the absence of a standardized consequence and likelihood classification specific to the offshore wind industry, relevant standards are reviewed and adjusted based on expert elicitation. The analysis of failure modes includes the identification of causes and effects and, for critical risks, of the failure mechanisms, which inform the choice of the most appropriate monitoring methods. Next, a generic decision tree is developed, applicable to the various subsystems, which suggests one of the candidate maintenance strategies (corrective maintenance, planned preventive maintenance, condition-based preventive maintenance, design modifications).
The developed approach is applied to a hypothetical wind energy asset, and more particularly to two of its critical components, presenting the results in processed form and identifying the maintenance significant components and the best approach to maintain them. The framework can be extended further to consider new monitoring technologies, developing a multi-criteria approach for their adoption in practice.
This work is part of the "Romeo Project" (https://www.romeoproject.eu/), a project within the European Union's Horizon 2020 research and innovation program, with the participation of 11 industrial partners and 1 academic partner.

MAINTENANCE FUNDAMENTALS
Offshore wind energy assets are complex engineering systems subject to harsh environmental and operational conditions. They are typically designed for 20-25 years of operation, often with a view to extending their service life beyond this period through evaluation of their integrity status over the years [3]. The uncertainty in the resulting loads, together with the relatively new technology involved, exposes these assets to a number of risks that operators and developers should control in order to achieve the objectives of an investment. Detailed studies on the identification and potential treatment of such risks have been presented in the literature [4], [5].
To keep the assets operational, a number of preventive and corrective activities are carried out, including inspections at predetermined intervals or following extreme events, which can be followed by corrective or planned preventive actions. Inspections can be assisted, where possible, by monitoring arrangements that collect higher-resolution data for appropriate features of the system [6], [7]. This approach is documented in a simple form by various standards such as [8], while an extended version is included here in Figure 1. As will be discussed later, this structure can be expanded significantly to include more options that operators can introduce in their practice. To maximize the profitability of assets, the optimal combination of these activities should be determined, considering that corrective maintenance is often very costly, while inspections or monitoring schemes that are more frequent than required also increase the cost of operation and maintenance.

Figure 1: Maintenance options
Corrective maintenance refers to taking action on a component after a failure has been recorded (failure here meaning the loss of operational capacity) [9]. This practice is often chosen for components whose integrity status cannot be evaluated accurately in a cost-effective way, or for components that are considered non-significant. Maintenance significant items are either functional significant items (items for which a functional failure has considerable consequences for at least one of the relevant consequence classes) or maintenance cost significant items (items with a high failure rate, high repair cost, low maintainability, long lead time for spare parts, or that require external maintenance personnel). Because a failure must occur before a corrective maintenance activity is triggered, this approach often leads to high costs due to the increased downtime that the asset has to suffer. For this reason, preventive approaches are often employed, either time-based or condition-based, which aim to reduce downtime through better forecasting of when a failure is expected to occur. Preventive approaches involve an additional cost for proactive measures, which should be balanced against the potential cost of the consequences of an unexpected failure. Selecting the best approach for each failure mode and each critical component is a core task of reliability-centered maintenance methods, a field of practice that is currently attracting significant attention.

Risk Assessment
To develop a cost-effective strategy, resources should be allocated so that critical components attract more attention and effort. The first step of the framework is therefore a structured risk assessment, which benchmarks the performance of components against the applicable failure modes/risks. A structured approach has been developed for this purpose, as illustrated in Figure 2 and elaborated further in [5].

Figure 2: Risk assessment framework
The process starts with the selection of the critical component of the system to be studied. Next, its function is defined so as to determine what would constitute a functional failure. For each component, the applicable failure modes should be established, describing the event that causes a functional failure (e.g. cracking or loss of stability). For each failure mode, one or more failure causes are determined, leading to the estimation of the relevant likelihood. Failure causes can be linked to design, fabrication, installation, operation, etc. For the criticality assessment, a three-parameter approach has been chosen here in the absence of an approach specific to the wind industry. The limits between the classes of each parameter are presented in the next subsection and constitute the basis for a risk policy. The failure end effect should be determined next, describing what will happen once a failure mode occurs. Existing controls, such as standard-prescribed inspections, existing monitoring or redundancy of the system, are relevant here and lead to the determination of the beta factor. This factor should be higher for failures that give limited warning before they occur. The next step is to determine the criticality of each failure mode, often taking into account a number of criteria, including cost of intervention, contribution to downtime, any safety implications, or impact on the environment. Depending on the corporate risk policy of an organization, specific criteria (e.g. safety) could carry a higher weight and hence dominate the calculation of the criticality number. This number, which benchmarks the importance of each failure mode, can be calculated using various equations.
For this work, we have assumed the following expression:

$$CN = L \times \beta \times C \qquad (1)$$

where CN is the criticality number, L is the likelihood, β is the beta factor (representing the conditional probability that the failure end effect materializes, given that the failure mode has occurred) and C is the consequence category. Once criticality has been determined for all failure modes, the most critical ones are analyzed further by examining the failure mechanism, i.e. the physics of the failure. This step, which can consider material degradation, excessive loading, etc., is important as it determines how the failure mode can be monitored through appropriate features. Having obtained this, we can select the appropriate maintenance strategy through a structured decision tree, as discussed in the relevant section.

Development of a Risk Policy
Development of a risk policy is a critical step in criticality assessment, as it should reflect organizational objectives together with the technical specificities of the component and system under study. Mature industries such as automotive or offshore Oil & Gas have developed widely applied risk policies, while this is not the case for the offshore wind industry. To this end, and in consultation with stakeholders across the supply chain, a specific policy has been developed, leading to equation (1), which involves three parameters with values specified as shown in Tables 1-3.

Table 1: Likelihood classes
Not expected (1): The occurrence of the described failure mode is not expected throughout the planned lifetime of the asset and under consideration of the current inspection or maintenance regime.
Possible (2): The failure mode could occur throughout the planned asset lifetime but it is not certain.
High (3): This failure mode would, under the given inspection and maintenance regime, certainly occur.

Table 2: Beta factor classes
Low (1): The described failure end effect will most likely not materialize if this failure mode occurs. There are several mitigation or detection measures in place which will prevent the fault from progressing to the worst-case effect.
Medium (2): The failure mode could progress to the described failure end effect, but it is not certain. In most cases, the end effect will not materialize.
High (3): The failure mode described will certainly lead to the described failure end effect.

Equation (1) takes one parameter for likelihood, one for the beta factor and the sum of five parameters for the consequence category, so that consequence carries a different weight from the other two. The minimum value that the product can take is 5 (1 × 1 × 5), while the maximum is 135 (3 × 3 × 15). The domain of possible values is divided into three categories, as shown in Table 4. Failure modes in the high and medium categories should be considered critical.
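As an illustration, the calculation described above can be sketched in a few lines of Python. This is a minimal sketch, not the project's tooling: the function name, the category identifiers and the assumption that every parameter takes an integer value from 1 to 3 are hypothetical, the last being inferred from the stated range of 5 to 135.

```python
from typing import Mapping

# Assumed scale: every parameter is scored 1 (lowest) to 3 (highest),
# inferred from the stated bounds 1*1*5 = 5 and 3*3*15 = 135.
LEVELS = (1, 2, 3)

# Hypothetical identifiers for the five consequence categories.
CONSEQUENCE_CATEGORIES = (
    "safety",
    "environment",
    "availability",
    "spare_part_cost",
    "intervention_cost",
)


def criticality_number(likelihood: int, beta: int,
                       consequences: Mapping[str, int]) -> int:
    """Return CN = L * beta * C, with C the sum of the five consequence scores."""
    if likelihood not in LEVELS:
        raise ValueError(f"likelihood must be one of {LEVELS}")
    if beta not in LEVELS:
        raise ValueError(f"beta factor must be one of {LEVELS}")
    if set(consequences) != set(CONSEQUENCE_CATEGORIES):
        raise ValueError("expected exactly the five consequence categories")
    if any(score not in LEVELS for score in consequences.values()):
        raise ValueError(f"consequence scores must be one of {LEVELS}")
    return likelihood * beta * sum(consequences.values())


# Bounds of the criticality domain.
lowest = criticality_number(1, 1, dict.fromkeys(CONSEQUENCE_CATEGORIES, 1))
highest = criticality_number(3, 3, dict.fromkeys(CONSEQUENCE_CATEGORIES, 3))
print(lowest, highest)  # 5 135
```

Summing the consequence scores before multiplying is what gives consequence its larger range (5 to 15) compared with the other two parameters (1 to 3).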

Maintenance decision tree
As mentioned earlier, an effective decision tree should optimize the ratio between preventive and corrective maintenance activities so as to maximize the life cycle profitability of the asset. As with risk policies, maintenance decision trees can be found in standards and guidance documents. The base of the tree should list all the maintenance options applicable to a given application, while at the higher levels a number of questions/steps develop the different decision paths. Figure 3 presents a decision tree that has been developed in this project and draws on the experience of different industry stakeholders.

Figure 3: Typical decision tree
The first step of the decision tree is to distinguish between critical and non-critical failure modes. The threshold set, based on the risk policy established above, is 43, so all failure modes below this limit can be treated through corrective maintenance (run-to-failure), as they are considered of low importance and preventive maintenance would involve unnecessary cost. The second question asks whether the condition of the item can be measured. If so, a feasibility check establishes whether determining the condition of the item is technically and economically feasible and, if it is, condition-based preventive maintenance qualifies. As per Figure 1, this can be based on periodic inspection or continuous monitoring. If the condition cannot be measured, or measuring it is not feasible (technically or economically), the feasibility of predetermined maintenance is explored and, if this check is positive, planned preventive maintenance qualifies. This can be calendar-based, e.g. every two years of operation, or operational cycle-based, e.g. after a certain number of loading cycles. If the answer is negative, compliance with the organisational risk policy is checked in order to evaluate whether the risk can be accepted. This can be the case when appropriate warranties or an insurance policy are in place, or when an operator is willing to accept certain risks, taking responsibility for restoring the asset if a failure occurs. If the risk is not acceptable, improvements in the design of the component or in the operational process should be introduced. This is an important element, as it illustrates that the whole process should be initiated in the design phase of a system so that most of the benefits of this structured approach can be realized. For example, if the process is initiated at a later stage, it may become prohibitively difficult to install appropriate measurement configurations, so certain features may not be measurable.
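The sequence of questions described above can be expressed as a short function. The sketch below is one possible reading of the tree, not a reproduction of Figure 3: the names are hypothetical, a criticality number equal to the threshold is treated as critical, and an accepted risk is assumed to fall back to corrective maintenance.

```python
from enum import Enum


class Strategy(Enum):
    CORRECTIVE = "corrective maintenance (run-to-failure)"
    CONDITION_BASED = "condition-based preventive maintenance"
    PLANNED = "planned preventive maintenance"
    DESIGN_MODIFICATION = "design or operational process modification"


CRITICALITY_THRESHOLD = 43  # threshold quoted in the text


def select_strategy(cn: int,
                    condition_measurable: bool,
                    condition_based_feasible: bool,
                    predetermined_feasible: bool,
                    risk_acceptable: bool) -> Strategy:
    """Walk the decision tree from top to bottom and return a strategy."""
    # Step 1: non-critical failure modes are run to failure.
    if cn < CRITICALITY_THRESHOLD:
        return Strategy.CORRECTIVE
    # Step 2: measurable condition plus technical and economic feasibility.
    if condition_measurable and condition_based_feasible:
        return Strategy.CONDITION_BASED
    # Step 3: calendar-based or cycle-based predetermined maintenance.
    if predetermined_feasible:
        return Strategy.PLANNED
    # Step 4: accept the risk (assumed here to mean corrective maintenance),
    # otherwise change the design or the operational process.
    if risk_acceptable:
        return Strategy.CORRECTIVE
    return Strategy.DESIGN_MODIFICATION


# A critical failure mode whose condition can be measured, but for which
# condition-based maintenance is not feasible.
print(select_strategy(44, True, False, True, False).value)
```

Encoding the questions as early returns keeps the order of the checks explicit, which matters because each later option is considered only when every earlier one has been ruled out.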

IMPLEMENTATION/CASE STUDY
To illustrate the applicability of the developed framework, two case studies are presented here, one for the blades and one for the support structure. After a detailed risk assessment of each component, the steps presented in section 3 are followed to select the maintenance strategy.
The support structure is a key component of the wind turbine, especially considering that in its most common form, the monopile, there is minimal allowance for maintenance (as most sub-components cannot be replaced) and most maintenance-related activities are periodic inspections. A detailed risk assessment, carried out across a number of workshops, has identified failure modes across the main subsystems of this component, which include primary and secondary steelwork, the corrosion protection system and miscellaneous subsystems [10], [11].
A typical example presented here is the loosening of bolts in the monopile-transition piece (MP-TP) bolted connection (primary steelwork). Loosening of the bolts (failure mode) can be attributed to fabrication/installation, through installation error, or to operation and maintenance, through overtightening of bolts (failure causes). The two causes lead to different CNs, so only the first is presented here. The likelihood of occurrence is considered possible, while the failure end effect is potentially the loss of the connection and/or the collapse of the tower. Although this is a monopile structure, there is some degree of redundancy in the bolts, so the beta factor is set to medium. With respect to the consequence categories, safety is set to marginal owing to the limited time that personnel spend on the asset; the effect on the environment is critical, as the asset will be lost; the impact on availability is critical; the spare part cost is marginal; and the cost of intervention is critical.
Applying the risk policy described in section 3.2, the calculated criticality number is 44, corresponding to medium criticality. The failure mechanism is mechanical and relates more specifically to ultimate limit state (ULS) requirements.
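The arithmetic behind this value can be reconstructed as follows. This is a sketch that assumes each class maps to 1, 2 or 3 in increasing severity (possible = 2, medium beta factor = 2, marginal = 1, critical = 3); the consequence scale is an inference from the stated range of 5 to 135 rather than a value given in this section, and the subscript names are illustrative.

```latex
\begin{align*}
C  &= C_{\text{safety}} + C_{\text{environment}} + C_{\text{availability}}
      + C_{\text{spare part}} + C_{\text{intervention}} \\
   &= 1 + 3 + 3 + 1 + 3 = 11, \\
CN &= L \times \beta \times C = 2 \times 2 \times 11 = 44 .
\end{align*}
```

Under these assumed values the result matches the reported criticality number and lies just above the threshold of 43.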

Figure 4: Decision tree implementation example - Item: MP-TP bolted connections, Failure mode: Fatigue cracks
The next step in the process is to apply the decision tree, as shown in Figure 4, where the sequence of answers is highlighted. The condition of the item can be measured; however, condition-based maintenance is not economically feasible. Predetermined maintenance is technically and economically feasible, so the bolt torque should be restored at predetermined intervals, e.g. every 5 years (Figure 4). Tension monitoring through tension indicators, to ensure correct tension after installation, is a common monitoring practice.
For the second example, a failure mode of the blade is presented. A typical blade can be divided into a number of sub-systems, including the internal and external surfaces, peripherals and the blade body [12]. The leading edge protection (LEP), which is part of the external surface of the blade, protects the blade leading edge (LE) against rain erosion. The failure mode of LEP erosion can be attributed to design causes, as rain and other precipitation will over time erode the blade LE. Based on operational experience, the likelihood of this occurring is high. The failure end effect is that the LEP erodes and the laminate is exposed, leading to structural damage and significant performance loss. The beta factor is high, as the damage will develop rapidly between normal inspection intervals. The impact on safety and the environment is marginal, the impact on availability is medium, and the costs of the spare part and of intervention are marginal.
This assessment leads to a criticality number of 54, which corresponds to a medium criticality level. Since the criticality number is above the threshold, the next step in the decision tree asks whether the condition of the component can be measured. This is possible, and condition-based maintenance is both technically and economically feasible, so the suggested strategy is condition-based maintenance, e.g. through visual inspection every 5 years. This involves inspecting the LEP and carrying out maintenance once degradation has exceeded a certain threshold (Figure 5).
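As a check on the reported value, the calculation can be written out. This is a sketch assuming that each class maps to 1, 2 or 3 in increasing severity (high likelihood = 3, high beta factor = 3, marginal = 1, medium = 2), an inference from the 5 to 135 range of the criticality number rather than a scale stated in this section.

```latex
\begin{align*}
C  &= \underbrace{1}_{\text{safety}} + \underbrace{1}_{\text{environment}}
      + \underbrace{2}_{\text{availability}} + \underbrace{1}_{\text{spare part}}
      + \underbrace{1}_{\text{intervention}} = 6, \\
CN &= L \times \beta \times C = 3 \times 3 \times 6 = 54 .
\end{align*}
```

Here the high likelihood and beta factor, rather than the consequence scores, are what push the failure mode above the threshold of 43.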

DISCUSSION AND CONCLUSIONS
In this work we have developed a structured framework for the selection of optimal maintenance strategies for offshore wind turbines. The basis of the approach is a detailed risk assessment, which directs resources towards the most maintenance significant components. It should be noted that the accuracy of the assessment relies on the risk policy of the organisation as well as on the competence of the attendees of the risk assessment workshops [13]. The risk policy should reflect the objectives of the organisation and ensure, to the extent possible, that risk assessment information and criticality numbers are comparable between components. Workshops should ensure that different perspectives on the design and operation of the components are well captured. This implies that not only designers and operational management practitioners should be involved, but also roles related to data collection and management, and representatives of the financial and logistics functions. It is also important to complete the worksheets exhaustively so that all failure modes are captured. It is good practice to standardise the failure causes, and it is important to use terminology consistently and avoid confusion between failure modes, causes and mechanisms.
The decision tree presented here is deliberately kept simple so as to ensure applicability across different sub-systems. A more detailed decision tree could consider more options at the bottom level, including more inspection methods and maintenance activities as well as the impact of dependent failures. The drawback of such a more analytical approach is that it requires more effort to complete, considering that a detailed assessment would reveal a great number of failure modes. The boundaries of the analysis should therefore be selected carefully in order to avoid excessive resource requirements for completing the exercise, which are often a barrier to further implementation.
Application of the method to two cases, one for a blade and one for the support structure, illustrates that it can qualify different maintenance strategies for different failure modes. It should be noted that the qualified maintenance strategies should be monitored periodically to ensure that the initial strategy remains effective; this is a key requirement of reliability-centered methods.