A Novel Ensemble Machine Learning Model for Oil Production Prediction with Two-Stage Data Preprocessing
Abstract
1. Introduction
2. Methodology
2.1. Random Forest Algorithm
- (a) Creation of root nodes: Initially, all data samples are placed in the root node. Each feature is then examined, and the optimal feature is identified to split the data samples, creating multiple subsets. Feature evaluation criteria such as information gain, information gain rate, and the Gini index are used for this purpose.
- (b) Creation of leaf nodes: The datasets divided by the optimal feature are placed in the leaf nodes.
- (c) Segmentation of leaf nodes: For each sub-dataset, the feature set consists of the features remaining after the optimal feature is removed. All remaining features are traversed, and the best one is selected to further split the sub-dataset, forming a new subset.
- (d) Construction of the decision tree model: Steps (a) to (c) are repeated until the predefined stopping conditions are met; typical criteria include reaching a specified number of leaf nodes satisfying certain conditions, exhausting all features for data division, and so on. A minimal code sketch of this procedure follows the list.
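Since the paper uses a random forest built from such trees to screen input features, here is a minimal sketch of the idea using scikit-learn. The file name, column names, and hyperparameter values are illustrative assumptions drawn from the Case 1 tables, not the authors' exact pipeline.

```python
# Hedged sketch: rank input features with a random forest before modeling.
# The data file and column names are assumptions (cf. the Case 1 tables).
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

features = ["Operation months", "Active days", "Choke size",
            "GOR", "Water cut", "FLP", "FLT", "THP"]
df = pd.read_csv("well_data.csv")  # hypothetical input file

rf = RandomForestRegressor(n_estimators=100, max_depth=10, random_state=0)
rf.fit(df[features], df["Monthly oil production"])

# Impurity-based importances, ranked as in the feature-importance tables
ranked = sorted(zip(features, rf.feature_importances_),
                key=lambda kv: kv[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.2f}")
```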
2.2. CEEMDAN Algorithm and Principle of Component Recombination
- (1) CEEMDAN algorithm
- (2) Principle of component recombination (a brief code sketch of both steps follows)
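The decomposition and recombination steps can be sketched in Python. This is a hedged illustration assuming the PyEMD package (installed as `EMD-signal`) and a one-sample t-test grouping rule consistent with the t-test tables reported later; it is not the authors' exact implementation.

```python
# Hedged sketch: CEEMDAN decomposition plus t-test-based recombination.
# The input file and the choice of test value are assumptions.
import numpy as np
from PyEMD import CEEMDAN            # pip install EMD-signal
from scipy.stats import ttest_1samp

series = np.loadtxt("monthly_oil_production.txt")  # hypothetical input

# (1) CEEMDAN: decompose the series into intrinsic mode functions (IMFs)
imfs = CEEMDAN()(series)

# (2) Recombination: test each IMF's mean against a fixed reference value
# (the t-test tables report one per case); IMFs are then merged into
# high-frequency, low-frequency, and residual components.
test_value = series.mean()
for i, imf in enumerate(imfs, start=1):
    t, p = ttest_1samp(imf, popmean=test_value)
    print(f"IMF{i}: t = {t:.2f}, p = {p:.3f}")
```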
2.3. Temporal Convolutional Network
- (1) Causal convolution ensures that the output at a given time depends only on the current and historical inputs, with no influence from future inputs.
- (2) The architecture can map time series data of arbitrary length to output data of the same length, akin to an RNN.

A TCN is built from three components (a residual-block sketch follows this list):

- (1) Causal convolution
- (2) Dilated convolution
- (3) Residual connection
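To ground these three components, here is a hedged sketch of a TCN residual block in tf.keras; the layer stacking and input width are illustrative assumptions, not the authors' code.

```python
# Hedged sketch: one TCN residual block built from causal, dilated 1-D
# convolutions with a skip connection (He et al.-style residual learning).
import tensorflow as tf
from tensorflow.keras import layers

def tcn_block(x, filters, kernel_size, dilation, dropout=0.4):
    """Two causal dilated convolutions plus a residual (skip) connection."""
    skip = x
    for _ in range(2):
        x = layers.Conv1D(filters, kernel_size, padding="causal",
                          dilation_rate=dilation, activation="relu")(x)
        x = layers.Dropout(dropout)(x)
    if skip.shape[-1] != filters:        # 1x1 conv to match channel counts
        skip = layers.Conv1D(filters, 1, padding="same")(skip)
    return layers.add([x, skip])

inputs = layers.Input(shape=(None, 8))   # 8 input features (assumed)
x = inputs
for d in [1, 2, 4]:                      # dilations, cf. Section 4.1.3
    x = tcn_block(x, filters=100, kernel_size=2, dilation=d)
model = tf.keras.Model(inputs, x)
```

Stacking blocks with increasing dilation widens the receptive field geometrically while keeping the output strictly causal.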
2.4. Gated Recurrent Unit
2.5. Multi-Head Attention
2.6. Model Architecture and Modeling Steps
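To make the modeling steps concrete, here is a hedged end-to-end sketch of one forecasting branch. The wiring (TCN, then GRU, then multi-head attention, then a dense output) follows the module names in Sections 2.3 through 2.5, but the head count, pooling choice, and layer ordering are assumptions for illustration; `tcn_block()` is the helper from the previous sketch.

```python
# Hedged sketch of one forecasting branch (TCN -> GRU -> multi-head
# attention -> dense), applied separately to each recombined component;
# the component forecasts are summed to reconstruct the series.
import tensorflow as tf
from tensorflow.keras import layers  # tcn_block() as defined above

inputs = layers.Input(shape=(None, 8))            # (time steps, features)
x = inputs
for d in [1, 2, 4]:
    x = tcn_block(x, filters=100, kernel_size=2, dilation=d)
x = layers.GRU(100, return_sequences=True)(x)     # GRU module
x = layers.MultiHeadAttention(num_heads=4,        # head count assumed
                              key_dim=25)(x, x)   # self-attention
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(1)(x)                      # one-step forecast

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
```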
3. Model Evaluation Indicators
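The result tables report RMSE, MAE, MAPE, and R². A minimal NumPy implementation of their standard definitions (assumed to match the paper's usage) is:

```python
# Standard definitions of the four evaluation indicators used in the tables.
import numpy as np

def evaluate(y_true, y_pred):
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / y_true))    # assumes no zero targets
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - np.mean(y_true)) ** 2)
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2}
```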
4. Results and Discussions
4.1. Application in Case 1
4.1.1. Data Description
4.1.2. Data Preprocessing
4.1.3. Parameters of the Models
- (1) High-frequency component: For the TCN module, the number of filters is 100, the kernel size is 2, and the dilations are [1, 2, 4]; for the GRU module, the number of hidden units is 100. The dropout rate is 0.4, Adam is the optimizer, the batch size is 64, the maximum number of epochs is 600, and the initial learning rate is 0.001.
- (2) Low-frequency component: For the TCN module, the number of filters is 90, the kernel size is 3, and the dilations are [1, 2, 4]; for the GRU module, the number of hidden units is 80. The dropout rate is 0.5, Adam is the optimizer, the batch size is 32, the maximum number of epochs is 600, and the initial learning rate is 0.001.
- (3) Residual component: For the TCN module, the number of filters is 96, the kernel size is 3, and the dilations are [1, 2, 4]; for the GRU module, the number of hidden units is 16. The dropout rate is 0.5, Adam is the optimizer, the batch size is 32, the maximum number of epochs is 600, and the initial learning rate is 0.001. These settings are gathered into a single configuration in the sketch after this list.
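For readability, the Case 1 settings can be collected into one structure; `build_branch()` is a hypothetical helper that would instantiate the TCN-GRU-attention branch sketched in Section 2.6 with these settings.

```python
# Case 1 hyperparameters from Section 4.1.3, gathered per CEEMDAN component.
# build_branch() is hypothetical, wrapping the branch sketched earlier.
case1_configs = {
    "high_frequency": dict(filters=100, kernel_size=2, dilations=[1, 2, 4],
                           gru_units=100, dropout=0.4, batch_size=64,
                           max_epochs=600, learning_rate=1e-3),
    "low_frequency":  dict(filters=90,  kernel_size=3, dilations=[1, 2, 4],
                           gru_units=80, dropout=0.5, batch_size=32,
                           max_epochs=600, learning_rate=1e-3),
    "residual":       dict(filters=96,  kernel_size=3, dilations=[1, 2, 4],
                           gru_units=16, dropout=0.5, batch_size=32,
                           max_epochs=600, learning_rate=1e-3),
}
branches = {name: build_branch(**cfg) for name, cfg in case1_configs.items()}
```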
4.1.4. Experimental Results and Discussions
4.2. Application in Case 2
4.2.1. Data Description
4.2.2. Data Preprocessing
4.2.3. Parameters of the Models
- (1) High-frequency component: For the TCN module, the number of filters is 32, the kernel size is 3, and the dilations are [1, 2, 4]; for the GRU module, the number of hidden units is 60. The dropout rate is 0.5, Adam is the optimizer, the batch size is 32, the maximum number of epochs is 600, and the initial learning rate is 0.001.
- (2) Low-frequency component: For the TCN module, the number of filters is 32, the kernel size is 3, and the dilations are [1, 2, 4]; for the GRU module, the number of hidden units is 32. The dropout rate is 0.2, Adam is the optimizer, the batch size is 32, the maximum number of epochs is 600, and the initial learning rate is 0.001.
- (3) Residual component: For the TCN module, the number of filters is 32, the kernel size is 3, and the dilations are [1, 2, 4]; for the GRU module, the number of hidden units is 64. The dropout rate is 0.5, Adam is the optimizer, the batch size is 32, the maximum number of epochs is 600, and the initial learning rate is 0.001.
4.2.4. Experimental Results and Discussions
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
| | | Maximum Value | Minimum Value | Average Value | Standard Deviation | Coefficient of Variation | Kurtosis | Skewness |
|---|---|---|---|---|---|---|---|---|
| The target variable | Monthly oil production (10⁴ bbl) | 13.2 | 1.6 | 6.0 | 2.1 | 0.3 | 0.3 | 0.3 |
| The input data | Operation months (m) | 118.0 | 1.0 | 59.5 | 34.2 | 0.6 | −1.2 | 0.0 |
| | Active days (d) | 31.0 | 12.8 | 29.0 | 3.7 | 0.1 | 8.2 | −2.8 |
| | Choke size (in) | 64.0 | 32.0 | 52.3 | 6.9 | 0.1 | 0.0 | 0.0 |
| | GOR (m³/m³) | 829.1 | 438.5 | 628.2 | 84.5 | 0.1 | −0.1 | 0.2 |
| | Water cut (%) | 4.7 | 0.0 | 0.7 | 1.1 | 1.6 | 4.1 | 2.1 |
| | FLP (psi) | 220.0 | 145.0 | 181.9 | 22.3 | 0.1 | −1.0 | −0.5 |
| | FLT (°C) | 66.0 | 40.0 | 57.0 | 6.2 | 0.1 | 0.0 | −0.9 |
| | THP (psi) | 1051.0 | 280.0 | 508.2 | 163.7 | 0.3 | 3.4 | 1.9 |
The top five features (with importance scores) for each random forest configuration:

| Depth of the Tree | Number of Trees | 1st Feature (Importance) | 2nd | 3rd | 4th | 5th |
|---|---|---|---|---|---|---|
| 5 | 50 | Operation Months (3.10) | Choke Size (1.16) | GOR (0.47) | FLP (0.36) | Active Days (0.25) |
| 5 | 100 | Operation Months (3.21) | Choke Size (1.12) | GOR (0.40) | THP (0.25) | Active Days (0.23) |
| 5 | 150 | Operation Months (2.97) | Choke Size (1.04) | GOR (0.43) | FLP (0.31) | Active Days (0.26) |
| 10 | 50 | Operation Months (3.73) | Choke Size (1.26) | GOR (0.53) | Active Days (0.44) | FLP (0.38) |
| 10 | 100 | Operation Months (3.45) | Choke Size (1.15) | GOR (0.49) | Active Days (0.48) | FLT (0.36) |
| 10 | 150 | Operation Months (3.19) | Choke Size (1.07) | GOR (0.50) | Active Days (0.41) | THP (0.39) |
| 15 | 50 | Operation Months (3.90) | Choke Size (1.22) | GOR (0.61) | Active Days (0.57) | FLT (0.43) |
| 15 | 100 | Operation Months (3.38) | Choke Size (1.13) | GOR (0.64) | Active Days (0.55) | FLT (0.45) |
| 15 | 150 | Operation Months (3.15) | Choke Size (1.04) | GOR (0.62) | Active Days (0.57) | THP (0.47) |
One-sample t-test results (test value = 602):

| | Size | t | Prob. | Mean Value of IMFx | Standard Deviation |
|---|---|---|---|---|---|
| IMF1 | 118 | −0.10 | 0.918 | 537.99 | 6730.64 |
| IMF2 | 118 | −4.83 | 0.000 | −47.37 | 1461.56 |
| IMF3 | 118 | −2.01 | 0.046 | −93.57 | 3750.78 |
| … | … | … | … | … | … |
| Model | Train RMSE (10⁴ bbl) | Train MAE (10⁴ bbl) | Train MAPE | Train R² | Test RMSE (10⁴ bbl) | Test MAE (10⁴ bbl) | Test MAPE | Test R² | Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| CNN-GRU | 0.447 | 0.313 | 0.052 | 0.949 | 0.738 | 0.639 | 0.157 | 0.573 | 6.98 |
| CNN-LSTM | 0.575 | 0.385 | 0.063 | 0.917 | 0.728 | 0.608 | 0.147 | 0.584 | 7.18 |
| CNN-BILSTM | 0.597 | 0.410 | 0.069 | 0.911 | 0.748 | 0.577 | 0.156 | 0.562 | 7.32 |
| RF-CNN-GRU | 0.630 | 0.397 | 0.064 | 0.900 | 0.667 | 0.570 | 0.174 | 0.651 | 6.81 |
| RF-CNN-LSTM | 0.637 | 0.442 | 0.073 | 0.898 | 0.669 | 0.576 | 0.142 | 0.650 | 6.92 |
| RF-CNN-BILSTM | 0.617 | 0.389 | 0.064 | 0.905 | 0.664 | 0.547 | 0.172 | 0.654 | 7.30 |
| CNN-GRU-MA | 0.538 | 0.368 | 0.061 | 0.927 | 0.693 | 0.565 | 0.136 | 0.623 | 6.83 |
| CNN-LSTM-MA | 0.654 | 0.427 | 0.069 | 0.893 | 0.697 | 0.607 | 0.183 | 0.620 | 7.38 |
| CNN-BILSTM-MA | 0.537 | 0.358 | 0.058 | 0.928 | 0.698 | 0.589 | 0.146 | 0.618 | 7.47 |
| RF-CNN-GRU-MA | 0.623 | 0.393 | 0.063 | 0.903 | 0.638 | 0.550 | 0.166 | 0.681 | 6.87 |
| RF-CNN-LSTM-MA | 0.653 | 0.429 | 0.070 | 0.893 | 0.646 | 0.565 | 0.170 | 0.673 | 6.98 |
| RF-CNN-BILSTM-MA | 0.621 | 0.387 | 0.063 | 0.903 | 0.645 | 0.545 | 0.168 | 0.674 | 7.30 |
| Proposed Approach | 0.319 | 0.256 | 0.049 | 0.974 | 0.619 | 0.541 | 0.145 | 0.699 | 99.2 |
| | | Maximum Value | Minimum Value | Average Value | Standard Deviation | Coefficient of Variation | Kurtosis | Skewness |
|---|---|---|---|---|---|---|---|---|
| The target variable | Monthly oil production (10⁴ bbl) | 25.9 | 4.31 | 15.09 | 5.53 | 0.37 | −1.02 | −0.05 |
| The input data | Operation months (m) | 279.00 | 1.00 | 140.00 | 80.68 | 0.58 | −1.20 | 0.00 |
| | Average active days (d) | 31.00 | 23.00 | 30.20 | 1.36 | 0.05 | 7.72 | −2.61 |
| | Water injection (10⁴ bbl) | 37.32 | 9.63 | 23.29 | 6.19 | 0.27 | −0.38 | 0.15 |
| | Water cut (%) | 52.10 | 0.19 | 16.59 | 14.24 | 0.86 | −0.85 | 0.54 |
| | Oil wells | 61.00 | 19.00 | 42.89 | 13.15 | 0.31 | −1.03 | −0.70 |
| | Water injection wells | 20.00 | 2.00 | 9.24 | 4.63 | 0.50 | −0.14 | 0.80 |
| | Total wells | 71.00 | 23.00 | 52.13 | 16.61 | 0.32 | −1.15 | −0.68 |
| | Injection–production ratio | 1.92 | 0.40 | 0.73 | 0.32 | 0.44 | 2.38 | 1.69 |
| | GOR (m³/m³) | 1.00 | 0.22 | 0.66 | 0.28 | 0.42 | −1.52 | −0.13 |
The top five features (with importance scores) for each random forest configuration:

| Depth of the Tree | Number of Trees | 1st Feature (Importance) | 2nd | 3rd | 4th | 5th |
|---|---|---|---|---|---|---|
| 5 | 50 | GOR (1.27) | Operation Months (1.15) | Total Wells (0.80) | Water Injection (0.45) | Water Injection Wells (0.43) |
| 5 | 100 | GOR (1.33) | Operation Months (1.08) | Total Wells (0.84) | Water Injection (0.45) | Water Injection Wells (0.39) |
| 5 | 150 | GOR (1.39) | Operation Months (1.13) | Total Wells (0.78) | Water Injection (0.43) | Water Injection Wells (0.40) |
| 10 | 50 | GOR (1.54) | Operation Months (1.25) | Total Wells (0.80) | Injection–Production Ratio (0.55) | Water Injection (0.53) |
| 10 | 100 | GOR (1.52) | Operation Months (1.15) | Total Wells (0.81) | Water Injection (0.57) | Injection–Production Ratio (0.44) |
| 10 | 150 | GOR (1.57) | Operation Months (1.19) | Total Wells (0.77) | Water Injection (0.54) | Injection–Production Ratio (0.41) |
| 15 | 50 | GOR (1.58) | Operation Months (1.26) | Total Wells (0.81) | Injection–Production Ratio (0.58) | Water Injection (0.56) |
| 15 | 100 | GOR (1.59) | Operation Months (1.16) | Total Wells (0.83) | Water Injection (0.59) | Injection–Production Ratio (0.48) |
| 15 | 150 | GOR (1.60) | Operation Months (1.20) | Total Wells (0.77) | Water Injection (0.57) | Injection–Production Ratio (0.46) |
One-sample t-test results (test value = 1509):

| | Size | t | Prob. | Mean Value of IMFx | Standard Deviation |
|---|---|---|---|---|---|
| IMF1 | 279 | −1.96 | 0.051 | 470.19 | 8758.74 |
| IMF2 | 279 | −11.19 | <0.001 | −10.26 | 2253.51 |
| IMF3 | 279 | −6.66 | <0.001 | −52.43 | 3631.38 |
| … | … | … | … | … | … |
| Model | Train RMSE (10⁴ bbl) | Train MAE (10⁴ bbl) | Train MAPE | Train R² | Test RMSE (10⁴ bbl) | Test MAE (10⁴ bbl) | Test MAPE | Test R² | Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| CNN-GRU | 1.441 | 1.785 | 0.088 | 0.846 | 1.232 | 1.491 | 0.183 | 0.635 | 7.11 |
| CNN-LSTM | 1.358 | 1.021 | 0.062 | 0.911 | 1.431 | 1.101 | 0.182 | 0.664 | 8.03 |
| CNN-BILSTM | 1.642 | 1.309 | 0.080 | 0.869 | 1.427 | 1.237 | 0.168 | 0.666 | 7.90 |
| RF-CNN-GRU | 1.728 | 1.390 | 0.084 | 0.855 | 1.450 | 1.258 | 0.169 | 0.654 | 6.78 |
| RF-CNN-LSTM | 1.554 | 1.229 | 0.074 | 0.883 | 1.422 | 1.120 | 0.173 | 0.668 | 7.31 |
| RF-CNN-BILSTM | 1.733 | 1.400 | 0.086 | 0.855 | 1.298 | 1.089 | 0.162 | 0.723 | 7.51 |
| CNN-GRU-MA | 1.743 | 1.405 | 0.085 | 0.853 | 1.378 | 1.108 | 0.173 | 0.688 | 7.50 |
| CNN-LSTM-MA | 1.321 | 1.028 | 0.062 | 0.916 | 1.342 | 1.088 | 0.174 | 0.704 | 8.19 |
| CNN-BILSTM-MA | 1.499 | 1.154 | 0.067 | 0.891 | 1.312 | 1.052 | 0.165 | 0.717 | 8.33 |
| RF-CNN-GRU-MA | 1.422 | 1.080 | 0.065 | 0.902 | 1.294 | 1.051 | 0.164 | 0.725 | 7.23 |
| RF-CNN-LSTM-MA | 1.318 | 1.027 | 0.062 | 0.916 | 1.244 | 1.028 | 0.154 | 0.746 | 7.47 |
| RF-CNN-BILSTM-MA | 1.453 | 1.106 | 0.066 | 0.898 | 1.226 | 1.004 | 0.155 | 0.753 | 7.80 |
| Proposed Approach | 1.387 | 1.055 | 0.066 | 0.907 | 1.131 | 0.927 | 0.137 | 0.788 | 103.5 |