Machine Learning: A Tool for Predicting Cavitation Erosion Rates on Turbine Runners

The authors present a case study of the use of machine learning to detect cavitation in a hydro turbine. The method employs commonly available proximity probes and a support vector machine classifier.

By Seth W. Gregg, John P.H. Steele, and Douglas L. Van Bossuyt

Peer Reviewed: This article has been evaluated and edited in accordance with reviews conducted by two or more professionals who have relevant expertise. These peer reviewers judge manuscripts for technical accuracy, usefulness, and overall importance within the hydroelectric industry.

Cavitation is one of the primary causes of turbine failure1,2 and is responsible for large annual monetary losses in terms of repair costs and lost generation.3 Restricting a hydro turbine’s operating range to avoid known cavitation zones can help mitigate damage, but damage can still occur due to seasonal variations in water levels, flooding or drought, and changes in the way turbines are operated. In addition, overly conservative restrictions can force turbines to run outside of their optimal efficiency ranges and limit operational flexibility.

Installing a real-time cavitation detection system is one choice to help a hydro turbine operator understand when cavitation is occurring. There are many choices for sensor-based cavitation detection,4,5,6 but no single method has been shown to be both economical and feasible for every style of hydro turbine and type of cavitation. Thus there is a need to develop better, more accessible methods for sensor-based cavitation detection.

An effective cavitation detection method must be able to track both the presence and the intensity of cavitation over a long period if it is to be used for estimating erosion rates. This is no trivial matter: such efforts have been attempted on operating hydro turbines, but long-term cavitation detection, intensity tracking and damage inspection are challenging in an industrial environment and, to our knowledge, erosion rate information has yet to be published.2,7

In this article, we address some of the difficulties of cavitation detection by presenting an automated method that can be used with many different sensor types and is robust to variability of the operating conditions of a real, production hydro turbine. The case study presented demonstrates a cavitation detection methodology we believe is more easily applied, when compared to existing methods, to both condition monitoring and long-term testing for estimating cavitation erosion rates. The methodology is demonstrated using data collected from proximity probes combined with a machine learning algorithm called a support vector machine (SVM), which is used to identify cavitation in an 85-MW Francis turbine.

An SVM is a type of machine learning algorithm, first described in the mid-1990s, that has since been used in academic studies to help detect machine faults for condition monitoring. Proximity probes are transmitters that produce a voltage signal proportional to the relative movement between the sensor and the hydro turbine shaft. Proximity probes can detect small movements that are usually measured in mils (thousandths of an inch). The use of proximity probes and SVMs represents a new direction in cavitation detection for three reasons:

  1. Proximity probes (or non-contacting eddy current displacement sensors) typically are not used for cavitation detection because cavitation is a high-frequency event and proximity probes are sensitive to lower frequency ranges. The use of proximity probes is not critical to the analysis shown in this article. Rather, they represent a type of sensor that has typically not been viewed as being responsive enough to cavitation. The use of proximity probe data combined with carefully selected cavitation detection features merely demonstrates how the use of lower-frequency sensors – maybe even ones already installed in a hydro turbine – can be quite effective for cavitation detection.
  2. Ramp-down data (collected while the turbine goes from fully open to fully closed wicket gates) is used to identify when cavitation is occurring and train (or calibrate) the SVM. Ramp-down data can be collected quickly and is unobtrusive, so the SVM can be recalibrated regularly. Using ramp-down data for recalibration is an advantage for long-term cavitation detection as it allows the SVM to be updated when operating conditions change due to seasonal or other sources of variation.
  3. To our knowledge, the use of machine learning algorithms for cavitation detection in hydro turbines has never been published. By using SVMs for cavitation detection, we open the door to a large and constantly evolving body of knowledge that can be used to improve how turbines are operated and maintained.

Case study

The case study presented uses data that the U.S. Department of the Interior’s Bureau of Reclamation collected during a cavitation survey of a Francis turbine at a hydropower plant in the western U.S. One of the turbines at the plant had undergone a major replacement of its runner and subsequently experienced large amounts of leading edge cavitation requiring major repairs. The turbine was known to still be experiencing cavitation after repairs were made, and vibration testing was performed to identify safe zones where the turbine could be operated to reduce cavitation damage.

The vibration test used to identify cavitation relied primarily on temporarily mounted high-frequency accelerometers, acoustic emission sensors, and even a specially designed wireless sensor setup mounted directly to the rotating shaft of the turbine. By contrast, the data used for this case study (which was collected during the same testing period) was from four proximity probes, two mounted 90 degrees apart near the lower turbine bearing (PP1 and PP2) and two mounted 90 degrees apart near the upper turbine bearing (PP3 and PP4). The proximity probe data was originally collected to capture only potential low frequency faults such as unbalance and draft tube swirl.

The following two data sets were used from the cavitation survey:

  • Data collected during a 100-second-long linear ramp-down of the hydro turbine starting at 85 MW and ending at 0 MW; and
  • Data collected during steady-state operation at 17 output conditions ranging from 5 MW to 85 MW in 5 MW increments.
The capacity of a hydro unit decreases quickly during rampdown (left), when data was gathered and analyzed to look for vibration frequency ranges (right) sensitive to cavitation.

The case study has two objectives: First, to demonstrate a procedure for selecting a cavitation detection feature and training an SVM using data that can be obtained quickly and with minimal disruption to turbine operation. Second, to test the feasibility of using SVMs and proximity probe data for cavitation detection.

A three-step procedure was followed to obtain a realistic estimate of how well an SVM would work for identifying cavitation:

  1. Feature Selection: Selecting a set of variables used by an algorithm to make predictions. The variables originally outlined were the sensor, sensor location and cavitation sensitivity parameter (CSP). The first two variables were fixed to keep the case study simple. The third variable then became the focus of the feature selection step for this study.
  2. Classifier Training: SVMs (and other supervised learning algorithms) require training with labeled data before they can make predictions from new unlabeled data. Our training data consisted of previously analyzed data containing both cavitating and non-cavitating conditions. Classifier training and testing were performed using MATLAB software (R2015a) with the Statistics and Machine Learning Toolbox.
  3. Classifier Testing: The SVM’s prediction accuracy can be tested using a separate data set where the labels are hidden from the SVM. Test data collected separately from the ramp-down data was fed into the SVM and the algorithm produced a predicted classification for each feature data point. The SVM’s prediction accuracy was calculated as the number of correct classifications divided by the number of feature test points.
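
The accuracy calculation in step 3 can be illustrated with a short Python sketch. The authors worked in MATLAB; Python is used here only for illustration, and the function name `prediction_accuracy` and the example labels are hypothetical.

```python
def prediction_accuracy(predicted, actual):
    """Fraction of test points whose predicted class matches the hidden label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("no test points supplied")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical labels: 1 = cavitating, -1 = not cavitating.
actual = [1, 1, -1, -1, 1, -1, 1, -1]
predicted = [1, 1, -1, 1, 1, -1, -1, -1]

accuracy = prediction_accuracy(predicted, actual)
print(f"{accuracy:.0%}")  # 6 of the 8 points match, so this prints 75%
```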

Feature selection

The first step toward determining a CSP is to analyze the ramp-down data for vibration frequency ranges sensitive to cavitation. To do this, ramp-down data that had previously been collected from the turbine (Figure 1 shows the turbine power as a function of time) was divided into one-second intervals. The Fast Fourier Transform was applied to each interval, resulting in 100 frequency spectra. The variance of each frequency bin across all 100 spectra was found using Equation 1:

$$\sigma_x^2 = \frac{1}{M-1}\sum_{m=1}^{M}\left(x_m - \mu_x\right)^2$$

where:

  • M is the total number of spectra,
  • x_m is the frequency amplitude in spectrum m, and
  • μ_x is the mean value of the amplitude over all the spectra.
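
The variance calculation of Equation 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the sampling rate, the synthetic signal, and the function names `amplitude_spectrum` and `variance_spectrum` are hypothetical, and a naive discrete Fourier transform stands in for the FFT so that only the standard library is needed.

```python
import cmath
import math


def amplitude_spectrum(samples, max_bin):
    """Naive DFT amplitude for bins 0..max_bin (a slow stand-in for an FFT)."""
    n = len(samples)
    spectrum = []
    for k in range(max_bin + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        spectrum.append(abs(total) / n)
    return spectrum


def variance_spectrum(spectra):
    """Variance of each frequency bin across M spectra, divided by M - 1."""
    m = len(spectra)
    variances = []
    for k in range(len(spectra[0])):
        column = [s[k] for s in spectra]
        mean = sum(column) / m
        variances.append(sum((x - mean) ** 2 for x in column) / (m - 1))
    return variances


# Synthetic stand-in for ramp-down data: a steady 5 Hz component plus a
# 20 Hz component whose amplitude changes from one interval to the next.
fs = 64            # samples per second (hypothetical)
n_intervals = 10   # number of one-second intervals (hypothetical)
spectra = []
for m in range(n_intervals):
    varying = m / (n_intervals - 1)
    interval = [math.sin(2 * math.pi * 5 * i / fs)
                + varying * math.sin(2 * math.pi * 20 * i / fs)
                for i in range(fs)]
    spectra.append(amplitude_spectrum(interval, max_bin=fs // 2))

var = variance_spectrum(spectra)
# With one-second intervals, the bin index equals the frequency in Hz.
peak_bin = max(range(len(var)), key=var.__getitem__)
print(peak_bin)
```

In this synthetic case the steady 5 Hz component contributes almost no variance, while the changing 20 Hz component stands out, which is the behavior the variance spectrum is meant to expose.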

The result is a single variance spectrum (see Figure 2) showing the variance in amplitude of frequencies from 1 to 100 Hz over the entire ramp-down sequence. Because Equation 1 is normalized by the total number of spectra minus one, the output compares the relative difference in amplitude of individual frequencies over the whole ramp-down. Analysis of the variance spectrum indicates two frequency ranges of interest: 0 to 20 Hz (frequency range 1) and 60 to 75 Hz (frequency range 2). The normalized amplitude of frequencies above 100 Hz is essentially zero.

To understand how frequency ranges 1 and 2 relate to cavitation, the root mean square (RMS) amplitude for each range is calculated for each one-second interval of the ramp-down data using Equation 2:

$$x_{\mathrm{RMS}} = \sqrt{\frac{1}{N}\sum_{n=1}^{N} x_n^2}$$

where:

  • x_n is the amplitude of each vibration sample in the one-second interval, and
  • N is the total number of vibration samples in each interval.
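
As a rough Python illustration of Equation 2, the sketch below computes the RMS of one interval and the RMS restricted to a frequency range. The article does not state how the signal was limited to each range, so summing the power of the DFT bins inside the band (Parseval's theorem) is an assumption made here, as are the function names, the sampling rate, and the synthetic signal.

```python
import cmath
import math


def rms(samples):
    """Equation 2: root mean square of the samples in one interval."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def band_rms(samples, fs, f_low, f_high):
    """RMS of the signal content between f_low and f_high (Hz).

    Assumption: the band is selected by summing DFT bin power, which is
    only one of several possible ways to band-limit the signal.
    """
    n = len(samples)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if not (f_low <= freq <= f_high):
            continue
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        # Bins other than DC and Nyquist appear twice in the full spectrum.
        weight = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        power += weight * abs(coeff) ** 2 / n ** 2
    return math.sqrt(power)


# Synthetic one-second interval: a 10 Hz tone of amplitude 3 and a
# 65 Hz tone of amplitude 1, sampled at a hypothetical 200 Hz.
fs = 200
signal = [3.0 * math.sin(2 * math.pi * 10 * i / fs)
          + 1.0 * math.sin(2 * math.pi * 65 * i / fs)
          for i in range(fs)]

range_1 = band_rms(signal, fs, 0.0, 20.0)    # close to 3 / sqrt(2)
range_2 = band_rms(signal, fs, 60.0, 75.0)   # close to 1 / sqrt(2)
overall = rms(signal)                        # close to sqrt(5)
print(range_1, range_2, overall)
```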

Figure 3 shows the results of the RMS ramp-down calculations for frequency ranges 1 and 2 versus turbine output. The amplitude of frequency range 1 peaks above 80 MW and around 30 MW, while the amplitude of frequency range 2 peaks near 60 MW. The difference in amplitude between the two ranges shows they are tracking different phenomena within the turbine during the ramp-down. Frequency range 1 is primarily made up of the turbine running speed vibration and its harmonics, while frequency range 2 includes the blade passing and guide vane passing frequencies.

Based on analysis performed outside of the scope of this article, frequency range 1 is tracking vibration caused by draft tube swirl while frequency range 2 is tracking erosive cavitation on the runner blades. Cavitation analysis and runner inspection were performed by Reclamation personnel using techniques similar to those discussed elsewhere.4,5

The RMS amplitudes of vibration within frequency ranges 1 and 2 are the first two sensitivity parameters chosen for detecting cavitation for this study. RMS amplitude within a frequency range is widely used as a simple form of cavitation detection5,6,8 and is commonly used in general condition monitoring as well.9

The third choice for a CSP is kurtosis of vibration in frequency range 1. Kurtosis is chosen because it is a way to measure the impulsiveness of a vibration signal10 and is seen in such condition monitoring applications as bearing and gear fault detection.11 Kurtosis is calculated using the mean μ and the standard deviation σ of the N values x_n in each one-second segment of the ramp-down data.

Equation 3:

$$K = \frac{\frac{1}{N}\sum_{n=1}^{N}\left(x_n - \mu\right)^4}{\sigma^4}$$
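
A short Python sketch shows how kurtosis responds to impulsiveness. It uses the standard fourth-moment definition with the population standard deviation (dividing by N); whether the authors used this or a sample-corrected form is not stated, so that choice, the function name, and the synthetic signals are assumptions.

```python
import math


def kurtosis(samples):
    """Fourth central moment divided by the squared variance (divide by N)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    fourth_moment = sum((x - mean) ** 4 for x in samples) / n
    return fourth_moment / variance ** 2


# A smooth sine wave versus the same wave with ten sharp spikes added.
n = 1000
smooth = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
impulsive = list(smooth)
for i in range(0, n, 100):
    impulsive[i] += 5.0

k_smooth = kurtosis(smooth)        # a pure sine gives about 1.5
k_impulsive = kurtosis(impulsive)  # the spikes raise the value well above that
print(k_smooth, k_impulsive)
```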

Table 1 shows the complete set of features chosen to be tested for cavitation detection using an SVM.

Classifier training

The SVM algorithm used for cavitation detection is trained using the features from Table 1 generated from the turbine ramp-down data. Ramp-down data can be collected relatively quickly and with little effort, allowing regular retraining of the SVM to prevent the algorithm from becoming ineffective due to changes in the parameters discussed previously.
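
To make the training step concrete, the sketch below fits a linear soft-margin SVM in plain Python by batch subgradient descent on the regularized hinge loss. This is a swapped-in, simplified trainer and not the MATLAB toolbox routine the authors used. The feature values, the function names, and the settings `reg`, `learning_rate` and `epochs` are all hypothetical.

```python
def train_linear_svm(features, labels, reg=0.01, learning_rate=0.01, epochs=5000):
    """Minimize reg/2 * |w|^2 + mean hinge loss by batch subgradient descent."""
    n = len(features)
    n_features = len(features[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [reg * wj for wj in w]
        grad_b = 0.0
        for x, y in zip(features, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # point is inside the margin or misclassified
                for j in range(n_features):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Class 1 on the positive side of the hyperplane, class -1 otherwise."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical feature points: [RMS in range 1, RMS in range 2].
train_x = [[0.5, 2.0], [1.0, 2.5], [0.8, 3.0], [1.5, 2.2],   # cavitating
           [0.6, 0.5], [1.2, 0.8], [2.0, 0.4], [0.9, 1.0]]   # not cavitating
train_y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(train_x, train_y)
print(predict(w, b, [1.0, 3.5]), predict(w, b, [1.0, 0.1]))
```

A production workflow would more likely rely on an established SVM library; the hand-written loop is shown only so the example has no dependencies.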

The disadvantage of training the SVM with ramp-down data is that the analyst must know or estimate when the hydro turbine is experiencing cavitation so that the data can be properly labeled.

This chart shows the root mean square during ramp-down of proximity probe 1.

For the case study described in this article, the data is labeled through a combination of knowledge gained from previous cavitation analysis on the hydro turbine and analysis of the ramp-down plots shown in Figures 3 and 4. Feature data collected between 81 MW and 40 MW is labeled as class 1 (cavitating) and the rest of the data collected is labeled as class -1 (not cavitating).
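
The labeling rule can be written as a small Python sketch. It assumes a perfectly linear ramp from 85 MW to 0 MW over 100 seconds and takes the power at the midpoint of each one-second interval; both simplifications, and the function name `label_by_power`, are illustrative rather than taken from the study.

```python
def label_by_power(power_mw, low_mw=40.0, high_mw=81.0):
    """Class 1 inside the assumed cavitation zone, class -1 outside it."""
    return 1 if low_mw <= power_mw <= high_mw else -1


# Idealized ramp-down: power at the midpoint of each one-second interval.
start_mw = 85.0
duration_s = 100
interval_power = [start_mw * (1 - (i + 0.5) / duration_s)
                  for i in range(duration_s)]

labels = [label_by_power(p) for p in interval_power]
n_class_1 = labels.count(1)
n_class_minus_1 = labels.count(-1)
print(n_class_1, n_class_minus_1)  # 48 and 52 under these assumptions
```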

Based on the ramp-down data and previous analysis performed on the hydro turbine, the data collected could have been given additional labels that would have been used to train the SVM to also recognize high vibration caused by draft tube swirl. An SVM that recognizes more than two classes is called a multiclass SVM.

Classifier testing

The ability of the trained SVM to recognize cavitation conditions in the hydro turbine is tested using additional proximity probe data collected during the cavitation survey. We believe that training with ramp-down data and testing with steady state running data is a realistic test of the SVM’s prediction capabilities under actual running conditions.

The test data was taken during steady-state running conditions at 17 power output levels. Each power level had 32 feature data points, for a total of 544 test points. The class label for each data point was determined through previous analysis of accelerometer and acoustic emission data taken during the cavitation survey.

Table 2 shows results of the classifier testing for each feature.


As can be seen from feature F_all_2, frequency range 2 is the best single CSP for predicting cavitation. The use of multiple parameters with the correct combination of sensors ultimately produces the best results, as can be seen with the top five ranked features. The additional advantage to including these parameters is that they can be used for detection of draft tube swirl when using a multiclass SVM to detect multiple types of faults.

The features with the overall best performance are F_34_all, F_4_all, and F_24_all, with close to 95% correct classifications. Due to the non-critical nature of detecting single cavitation events and the simplicity of the SVM model used, this is deemed an acceptable score. This score may potentially be improved through the use of soft margin classifiers or non-linear classifiers specifically tailored to the cavitation detection application.

This chart shows the root mean square during ramp-down of proximity probe 2.

Evaluation of the top three sensors also indicates PP4 is the single best sensor to use, while PP3 has the worst single sensor performance, with only 80% correct classifications. Multiple sensors do not necessarily improve the performance of the classifier; however, there is a potential advantage to using multiple sensors for long-term robustness. One way to take advantage of having multiple sensors would be to use two separate SVM classifiers, each making predictions with a different sensor. A double sensor setup such as this would allow a faulty sensor to be detected and provide data for identifying false negatives and false positives.
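
The two-classifier idea can be sketched in Python as a simple agreement check. The prediction lists and the function name `cross_check` are hypothetical; the sketch only flags intervals where the two single-sensor classifiers disagree and does not decide which sensor is at fault.

```python
def cross_check(pred_a, pred_b):
    """Return the agreement rate and the indices where two classifiers disagree."""
    if len(pred_a) != len(pred_b) or not pred_a:
        raise ValueError("prediction lists must be non-empty and equal in length")
    disagreements = [i for i, (a, b) in enumerate(zip(pred_a, pred_b)) if a != b]
    agreement = 1 - len(disagreements) / len(pred_a)
    return agreement, disagreements


# Hypothetical per-interval predictions from two single-sensor SVMs.
pred_sensor_a = [1, 1, -1, -1, 1, 1, -1, -1]
pred_sensor_b = [1, 1, -1, 1, 1, 1, -1, -1]

agreement, disagreements = cross_check(pred_sensor_a, pred_sensor_b)
print(agreement, disagreements)  # 0.875 and [3]
```

A persistently low agreement rate would point toward a sensor problem, while isolated disagreements mark intervals worth reviewing as possible false positives or false negatives.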

Although this case study does not directly address collecting cavitation intensity, the internal structure of the SVM includes a way to quantify the distance that each feature data point is from the internal classification boundary (the SVM hyperplane). This distance is called the SVM score and can be used to measure cavitation intensity.
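
For a linear SVM, the score described above can be sketched in Python as follows. The weight vector `w`, offset `b`, and feature points are hypothetical values chosen for easy arithmetic, and treating a larger score as higher cavitation intensity is the authors' suggestion rather than something this sketch validates.

```python
import math


def svm_score(w, b, x):
    """Signed decision value w.x + b; the sign gives the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def distance_to_hyperplane(w, b, x):
    """Signed geometric distance from x to the hyperplane w.x + b = 0."""
    return svm_score(w, b, x) / math.sqrt(sum(wj * wj for wj in w))


# Hypothetical trained model and two feature points in class 1.
w = [0.0, 2.0]
b = -3.0
near_boundary = [1.0, 1.6]
far_from_boundary = [1.0, 3.0]

d_near = distance_to_hyperplane(w, b, near_boundary)     # about 0.1
d_far = distance_to_hyperplane(w, b, far_from_boundary)  # about 1.5
print(d_near, d_far)
```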


This case study shows how an SVM classifier can be used to identify cavitation in a hydro turbine – the first step toward estimating cavitation erosion rates. The combination of the use of a machine learning algorithm, proximity probes, and ramp-down data will make long-term cavitation data collection easier and more accurate, which will allow cavitation erosion rates to be more easily estimated.


The authors wish to thank John Germann and James DeHaan for collecting the cavitation survey data and their guidance with data analysis. The information, data, or work presented herein was funded in part by the Office of Energy Efficiency and Renewable Energy (EERE), U.S. Department of Energy, under Award Number DE-EE0002668 and the Hydro Research Foundation. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation or favoring by the U.S. Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the U.S. Government or any agency thereof.

Seth Gregg is a former Hydro Research Fellow and is now a manufacturing intelligence consultant with Logical Systems LLC. John Steele is an associate professor with Colorado School of Mines. Douglas Van Bossuyt is a partner at KTM Research LLC.


1Dorji, U., and R. Ghomashchi, “Hydro Turbine Failure Mechanisms: An Overview,” Eng. Fail. Anal., Vol. 44, 2014, pages 136-147.

2Bourdon, P., M. Farhat, Y. Mossoba, and P. Lavigne, “Hydro Turbine Profitability and Cavitation Erosion,” Waterpower’99, HCI Publications, Kansas City, Mo., 1999.

3“The Knowledge Stream – Detecting Cavitation to Protect and Maintain Hydraulic Turbines,” Bureau of Reclamation Research and Development Office, 2014.

4Escaler, X., E. Egusquiza, M. Farhat, and F. Avellan, “Vibration Cavitation Detection using Onboard Measurements,” Fifth International Symposium on Cavitation, 2015, pages 1-7.

5Escaler, X., et al., “Detection of Cavitation in Hydraulic Turbines,” Mech. Syst. Signal Process., Vol. 20, 2006, pages 983-1007.

6Cencic, T., M. Hocevar, and B. Sirok, “Study of Erosive Cavitation Detection in Pump Mode of Pump-Storage Hydropower Plant Prototype,” J. Fluids Eng., Vol. 136, No. 5, 2014.

7Francois, L., “Vibratory Detection System of Cavitation Erosion: Historic and Algorithm Validation,” Proceedings of the Eighth International Symposium on Cavitation, 2012, pages 325-330.

8Schmidt, H., et al., “Influence of the Vibro-acoustic Sensor Position on Cavitation Detection in a Kaplan Turbine,” IOP Conf. Ser. Earth Environ. Sci., Vol. 22, No. 5, 2014.

9Randall, R.B., Vibration-based Condition Monitoring: Industrial, Aerospace and Automotive Applications, 2010.

10Pachaud, C., R. Salvetat, and C. Fray, “Crest Factor and Kurtosis Contributions to Identify Defects Inducing Periodical Impulsive Forces,” Mech. Syst. Signal Process., Vol. 11, No. 6, 1997, pages 903-916.

11Chen, H., Y. Shang, and K. Sun, “Multiple Fault Condition Recognition of Gearbox with Sequential Hypothesis Test,” Mech. Syst. Signal Process., Vol. 40, No. 2, 2013, pages 469-482.
