Outlying Aspect Mining via Sum-Product Networks

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13935)

Abstract

Outlying Aspect Mining (OAM) is the task of identifying a subset of features that distinguish an outlier from normal data, which is important for downstream (human) decision-making. Existing methods are based on beam search in the space of feature subsets. Because they must compute an outlier score for every subset they examine, they rely on simple outlier scoring algorithms.
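
The beam search over feature subsets described above can be sketched as follows. This is a minimal illustration, not any specific published method: the function names `zscore_outlier_score` and `beam_search_oam`, the parameters `beam_width` and `max_dim`, and the mean squared z-score used as the outlier score are all assumptions chosen for brevity.

```python
import numpy as np


def zscore_outlier_score(data, query, subset):
    """Placeholder score (an assumption): mean squared z-score of the query
    over the features in `subset`. Higher means more outlying."""
    cols = list(subset)
    mu = data[:, cols].mean(axis=0)
    sigma = data[:, cols].std(axis=0) + 1e-12  # guard against zero variance
    return float(np.mean(((query[cols] - mu) / sigma) ** 2))


def beam_search_oam(data, query, score_fn, beam_width=2, max_dim=2):
    """Grow feature subsets one feature at a time, keeping only the
    `beam_width` highest-scoring subsets at each size."""
    n_features = data.shape[1]
    beam = [()]  # start from the empty subset
    best_score, best_subset = -np.inf, None
    for _ in range(max_dim):
        # Extend every subset in the beam by one unused feature.
        candidates = {
            tuple(sorted(subset + (f,)))
            for subset in beam
            for f in range(n_features)
            if f not in subset
        }
        if not candidates:
            break
        # The score is recomputed for every examined subset.
        scored = sorted(
            ((score_fn(data, query, s), s) for s in candidates), reverse=True
        )
        if scored[0][0] > best_score:
            best_score, best_subset = scored[0]
        beam = [s for _, s in scored[:beam_width]]
    return best_subset, best_score


# Synthetic data: 200 standard-normal points in 4 dimensions, plus a query
# that is extreme only in feature 2.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 4))
query = np.zeros(4)
query[2] = 8.0

subset, score = beam_search_oam(data, query, zscore_outlier_score)
print(subset, round(score, 2))
```

The scoring function is called once per examined subset, so the cost of the search grows with the cost of a single score. This is why search-based methods tend to favour cheap scoring functions.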

In this paper, we propose SOAM, a novel OAM algorithm based on Sum-Product Networks (SPNs), a class of probabilistic circuits that can accurately model high-dimensional distributions. Our approach fits an SPN only once and exploits the tractability of marginal inference in SPNs to compute outlier scores in feature subsets. Computing outlier scores in subsets is therefore fast while still being based on a flexible and accurate density estimator. We empirically show that SOAM clearly outperforms the state-of-the-art method in search-based OAM, and even outperforms recent deep learning-based methods in the majority of the investigated cases. SOAM is available at github.com/stefanluedtke/Sum-Product-Network-OAM.
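
To illustrate why a single fitted SPN suffices for all feature subsets, the sketch below evaluates marginal densities in a tiny hand-built SPN. It is not the SOAM implementation: the classes `Leaf`, `Product`, and `Sum`, the function `log_marginal`, and the toy parameters are hypothetical, and the structure is fixed by hand rather than learned. It relies only on the standard property that, in a complete and decomposable SPN, a variable is marginalized out by letting its leaves evaluate to 1.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Leaf:
    var: int      # index of the feature this Gaussian leaf models
    mean: float
    std: float


@dataclass
class Product:
    children: list


@dataclass
class Sum:
    weights: list  # mixture weights, assumed to sum to 1
    children: list


def log_marginal(node, x, keep):
    """Log-density of x restricted to the features in `keep`.
    Features outside `keep` are integrated out in one bottom-up pass."""
    if isinstance(node, Leaf):
        if node.var not in keep:
            return 0.0  # a marginalized leaf integrates to 1, i.e. log 1
        z = (x[node.var] - node.mean) / node.std
        return -0.5 * z * z - math.log(node.std) - 0.5 * math.log(2 * math.pi)
    if isinstance(node, Product):
        return sum(log_marginal(c, x, keep) for c in node.children)
    # Sum node: weighted mixture, computed with log-sum-exp for stability.
    terms = [
        math.log(w) + log_marginal(c, x, keep)
        for w, c in zip(node.weights, node.children)
    ]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


# Toy model: features 0 and 1 are jointly near (0, 0) or near (5, 5);
# feature 2 is always near 0.
comp_a = Product([Leaf(0, 0.0, 1.0), Leaf(1, 0.0, 1.0), Leaf(2, 0.0, 1.0)])
comp_b = Product([Leaf(0, 5.0, 1.0), Leaf(1, 5.0, 1.0), Leaf(2, 0.0, 1.0)])
spn = Sum([0.5, 0.5], [comp_a, comp_b])

# Each feature of the query looks normal on its own, but the combination
# of features 0 and 1 does not.
query = [0.0, 5.0, 0.0]

# One model, many subsets: no refitting is needed per subset.
pair_logp = {
    s: log_marginal(spn, query, set(s)) for s in combinations(range(3), 2)
}
most_outlying = min(pair_logp, key=pair_logp.get)
print(most_outlying, pair_logp)
```

The example compares only subsets of equal size, because raw densities of subsets with different dimensionality are not directly comparable. Any normalization needed to compare across sizes is left out of this sketch.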

Notes

  1. Available at www.ipd.kit.edu/mitarbeiter/muellere/HiCS.

  2. Available at github.com/xuhongzuo/outlier-interpretation.

  3. github.com/xuhongzuo/outlier-interpretation.

Acknowledgements

Stefan Lüdtke acknowledges the financial support by the Federal Ministry of Education and Research of Germany and by the Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus in the program Center of Excellence for AI-research “Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig”, project identification number: ScaDS.AI.

Author information

Corresponding author

Correspondence to Stefan Lüdtke.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lüdtke, S., Bartelt, C., Stuckenschmidt, H. (2023). Outlying Aspect Mining via Sum-Product Networks. In: Kashima, H., Ide, T., Peng, WC. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2023. Lecture Notes in Computer Science, vol 13935. Springer, Cham. https://doi.org/10.1007/978-3-031-33374-3_3

  • DOI: https://doi.org/10.1007/978-3-031-33374-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-33373-6

  • Online ISBN: 978-3-031-33374-3

  • eBook Packages: Computer Science (R0)
