research-article

Explainable finite mixture of mixtures of bounded asymmetric generalized Gaussian and Uniform distributions learning for energy demand management

Published: 18 June 2024

Abstract

We introduce a mixture of mixtures of bounded asymmetric generalized Gaussian and uniform distributions. Based on this framework, we propose model-based classification and model-based clustering algorithms. For the unsupervised setting, we derive an objective function for the minimum message length (MML) model selection criterion to determine the optimal number of clusters. Given the considerable attention that Explainable AI (XAI) has received in recent years, we also introduce a method to interpret the model’s predictions in both learning settings by defining their boundaries in terms of the key features. Integrating explainability into the proposed algorithm increases the credibility of its predictions, since they can be explained to the user through simple If-Then statements derived from a small binary decision tree. The proposed algorithm is shown to be reliable and superior to several state-of-the-art machine learning algorithms in the following real-world applications: fault detection and diagnosis (FDD) in chillers, occupancy estimation, and categorization of residential energy consumers.
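To make the idea of If-Then explanations concrete, the following is a minimal sketch, not the authors' algorithm. It assumes that some fitted model has already assigned a label to each observation, and it searches for the single axis-aligned threshold (a depth-one binary tree) that best reproduces those labels. The function names `best_split` and `rule_text`, the feature names, and the toy sensor readings are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def best_split(rows, labels):
    """Return (errors, feature, threshold, left_label, right_label) for the
    single axis-aligned split that misclassifies the fewest rows."""
    best = None
    n_features = len(rows[0])
    for f in range(n_features):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values,
        # so both sides of every candidate split are non-empty.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            l_lab, l_cnt = Counter(left).most_common(1)[0]
            r_lab, r_cnt = Counter(right).most_common(1)[0]
            # Rows that disagree with the majority label on their side.
            errors = (len(left) - l_cnt) + (len(right) - r_cnt)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best


def rule_text(split, names):
    """Render a split as a human-readable If-Then statement."""
    _, f, t, l_lab, r_lab = split
    return f"IF {names[f]} <= {t:g} THEN {l_lab} ELSE {r_lab}"


# Hypothetical toy data: (CO2 in ppm, power in W), with labels assumed to
# come from an already fitted clustering or classification model.
rows = [(420, 30), (450, 45), (480, 35), (900, 40), (950, 32), (1000, 50)]
labels = ["empty", "empty", "empty", "occupied", "occupied", "occupied"]

split = best_split(rows, labels)
rule = rule_text(split, ["co2_ppm", "power_w"])
print(rule)
```

A deeper tree would apply the same search recursively to each side of the split. Keeping the tree small trades some fidelity to the underlying model for rules that a user can read at a glance.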

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 15, Issue 4
August 2024
563 pages
EISSN: 2157-6912
DOI: 10.1145/3613644
Editor: Huan Liu

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 June 2024
Online AM: 26 March 2024
Accepted: 04 March 2024
Revised: 24 October 2023
Received: 16 November 2022
Published in TIST Volume 15, Issue 4

Author Tags

  1. Multivariate Statistics
  2. Mixture Distributions
  3. Maximum-likelihood Classification
  4. Optimization
  5. Sensors
  6. Mobile Applications

Qualifiers

  • Research-article

Article Metrics

  • Total Citations: 0
  • Total Downloads: 191
  • Downloads (Last 12 months): 191
  • Downloads (Last 6 weeks): 31

Reflects downloads up to 01 Nov 2024
