Comparative Analysis of Fault Classification Algorithms for Triplex Pumps

Zakwan Skaf

Abstract

Maintenance strategies have evolved significantly over the last thirty years, driven by advances in digital twin modeling, sensor technology, communication, augmented reality, artificial intelligence, and predictive analytics. Fault detection and isolation (FDI) in complex systems such as triplex pumps has become a critical component of effective maintenance planning, so monitoring the triplex pump is crucial to managing faults and avoiding unscheduled maintenance. Feature extraction and selection are pivotal for optimizing fault diagnosis algorithms. This paper presents a comparative study of fault classification algorithms using data collected from a simulation model of a triplex pump under different failure scenarios. Features are extracted from the pump's flow signal and grouped into four sets. The first set includes all features extracted from the signal, a combination of time-domain and frequency-domain features. The second set includes only the time-domain features. The third set includes only the frequency-domain features. The fourth set includes the peak magnitude in the power spectrum and the mean value of the flow signal, which are the highest-ranked features in the second and third sets according to the Chi2 (chi-square) ranking algorithm. Fourteen classification algorithms are trained, validated, and tested on each of the four feature sets using the simulation data. The simulation provides data for seven operating scenarios: a fault-free healthy condition, three single failures, and three combined failures. The performance of the classification algorithms is evaluated using recall, precision, accuracy, and F1 score. The results show that the Weighted KNN and Bagged Trees Ensemble algorithms achieve perfect accuracy (100%) across all feature sets, indicating their robustness and effectiveness on this classification task.
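The two features in the fourth set, the mean of the flow signal and the peak magnitude in its power spectrum, can be illustrated with a minimal sketch. Everything numeric here is an assumption for illustration: the synthetic signal, the sample rate, the 15 Hz ripple, and the power normalization are hypothetical and are not taken from the paper's simulation model. A naive DFT from the standard library is used so the example is self-contained.

```python
import cmath
import math


def mean_value(signal):
    """Time-domain feature: arithmetic mean of the samples."""
    return sum(signal) / len(signal)


def power_spectrum(signal, sample_rate):
    """One-sided power spectrum of the mean-removed signal (naive DFT).

    Returns (frequencies_hz, power) for bins 1..N/2; the DC bin is skipped
    because the mean is reported as a separate feature.
    The |X[k]|^2 / N^2 normalization is an assumption of this sketch.
    """
    n = len(signal)
    mu = mean_value(signal)
    centered = [x - mu for x in signal]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2 / n ** 2)
    return freqs, power


def peak_spectral_feature(signal, sample_rate):
    """Frequency-domain feature: location and magnitude of the largest peak."""
    freqs, power = power_spectrum(signal, sample_rate)
    idx = max(range(len(power)), key=power.__getitem__)
    return freqs[idx], power[idx]


# Hypothetical stand-in for a pump flow signal: constant flow plus a ripple.
SAMPLE_RATE_HZ = 200.0
N_SAMPLES = 200
flow = [
    35.0 + 2.0 * math.sin(2 * math.pi * 15.0 * i / SAMPLE_RATE_HZ)
    for i in range(N_SAMPLES)
]

flow_mean = mean_value(flow)
peak_freq_hz, peak_power = peak_spectral_feature(flow, SAMPLE_RATE_HZ)
print(flow_mean, peak_freq_hz, peak_power)
```

In practice an FFT routine would replace the naive DFT; the point is only that each signal segment is reduced to a small feature vector before classification.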
However, some algorithms exhibit variable performance depending on the feature set used. For example, the accuracy of the Efficient Linear SVM algorithm drops significantly with the 14-feature set or the 5 frequency-domain features compared with the other feature sets, suggesting that this model is poorly matched to those feature spaces. In addition, accuracy, precision, recall, and F1 score vary markedly across models. The Weighted KNN, Bagged Trees Ensemble, and Neural Network models stand out, achieving perfect scores on all four metrics.
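The four evaluation metrics named above can be computed from true and predicted labels as in the following sketch. The class names and label lists are hypothetical, and macro-averaging over classes is an assumption of this example; the abstract does not state how the per-class scores are aggregated.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (accuracy, per-class metrics, macro-averaged metrics)."""
    labels = sorted(set(y_true) | set(y_pred))
    # pair_counts[(t, p)] = number of samples with true label t predicted as p
    pair_counts = Counter(zip(y_true, y_pred))
    accuracy = sum(pair_counts[(c, c)] for c in labels) / len(y_true)

    per_class = {}
    for c in labels:
        tp = pair_counts[(c, c)]
        fp = sum(pair_counts[(t, c)] for t in labels if t != c)
        fn = sum(pair_counts[(c, p)] for p in labels if p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[c] = {"precision": precision, "recall": recall, "f1": f1}

    # Macro average: unweighted mean over classes (an assumption here).
    macro = {
        name: sum(m[name] for m in per_class.values()) / len(labels)
        for name in ("precision", "recall", "f1")
    }
    return accuracy, per_class, macro


# Hypothetical labels: one healthy class and two illustrative fault classes.
y_true = ["healthy", "healthy", "fault_a", "fault_a", "fault_b", "fault_b"]
y_pred = ["healthy", "fault_a", "fault_a", "fault_a", "fault_b", "fault_b"]

accuracy, per_class, macro = classification_metrics(y_true, y_pred)
print(accuracy, per_class, macro)
```

A classifier with perfect scores, as reported for Weighted KNN and Bagged Trees Ensemble, would yield 1.0 for accuracy and for every per-class entry.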
