Discovering cluster-based local outliers

Published: 01 June 2003

Abstract

In this paper, we present a new definition of an outlier, the cluster-based local outlier, which is meaningful and gives importance to local data behavior. We design a measure for identifying the physical significance of an outlier, called the cluster-based local outlier factor (CBLOF). We also propose the FindCBLOF algorithm for discovering outliers. The experimental results show that our approach outperforms existing methods in identifying meaningful and interesting outliers.

Publisher

Elsevier Science Inc.

United States

Author Tags

  1. clustering
  2. data mining
  3. outlier detection

Qualifiers

  • Article

Cited By

  • (2024) A new unsupervised outlier detection method. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 46:1 (1713-1734). DOI: 10.3233/JIFS-236518. Online publication date: 1-Jan-2024
  • (2024) Outlier detection using conditional information entropy and rough set theory. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 46:1 (1899-1918). DOI: 10.3233/JIFS-236009. Online publication date: 1-Jan-2024
  • (2024) Robust Multi-Kernel Nearest Neighborhood for Outlier Detection. IEEE Transactions on Knowledge and Data Engineering 36:8 (4220-4231). DOI: 10.1109/TKDE.2024.3364179. Online publication date: 1-Aug-2024
  • (2024) An Efficient Adaptive Multi-Kernel Learning With Safe Screening Rule for Outlier Detection. IEEE Transactions on Knowledge and Data Engineering 36:8 (3656-3669). DOI: 10.1109/TKDE.2023.3330708. Online publication date: 1-Aug-2024
  • (2024) Trustworthy semi‐supervised anomaly detection for online‐to‐offline logistics business in merchant identification. CAAI Transactions on Intelligence Technology 9:3 (544-556). DOI: 10.1049/cit2.12301. Online publication date: 14-Apr-2024
  • (2024) Random clustering-based outlier detector. Information Sciences: an International Journal 667:C. DOI: 10.1016/j.ins.2024.120498. Online publication date: 1-May-2024
  • (2024) Fusing multi-scale fuzzy information to detect outliers. Information Fusion 103:C. DOI: 10.1016/j.inffus.2023.102133. Online publication date: 1-Mar-2024
  • (2024) Multi-view Outlier Detection via Graphs Denoising. Information Fusion 101:C. DOI: 10.1016/j.inffus.2023.102012. Online publication date: 1-Jan-2024
  • (2024) Exploiting fuzzy rough entropy to detect anomalies. International Journal of Approximate Reasoning 165:C. DOI: 10.1016/j.ijar.2023.109087. Online publication date: 1-Feb-2024
  • (2024) A self-supervised anomaly detection algorithm with interpretability. Expert Systems with Applications: An International Journal 237:PB. DOI: 10.1016/j.eswa.2023.121539. Online publication date: 1-Feb-2024
