DOI: 10.5555/645941.674192
Article

Outlier Detection Integrating Semantic Knowledge

Published: 11 August 2002

Abstract

Existing proposals on outlier detection do not take the semantic knowledge of the dataset into consideration. They try to find outliers using only the dataset itself, which prevents them from finding more meaningful outliers. In this paper, we consider the problem of outlier detection integrating semantic knowledge. We introduce a new definition of outlier: the semantic outlier. A semantic outlier is a data point that behaves differently from the other data points in the same class. We present a measure of the degree to which each object is an outlier, called the semantic outlier factor (SOF). We also propose an efficient algorithm for mining semantic outliers based on SOF. Experimental results show that our method finds meaningful and interesting outliers.
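The abstract does not give the SOF formula, so the sketch below is not the paper's measure. It only illustrates the general idea of scoring a point by how differently it behaves from other points that carry the same class label. The function name `class_deviation_scores`, the centroid-distance ratio, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist
from statistics import mean


def class_deviation_scores(points, labels):
    """Score each point by its distance to its own class centroid,
    divided by the average such distance within that class.

    Illustrative stand-in only; this is NOT the paper's SOF formula.
    """
    by_class = defaultdict(list)
    for p, c in zip(points, labels):
        by_class[c].append(p)

    # Centroid and mean member-to-centroid distance for each class.
    stats = {}
    for c, members in by_class.items():
        centroid = tuple(mean(axis) for axis in zip(*members))
        spread = mean(dist(m, centroid) for m in members)
        stats[c] = (centroid, spread)

    scores = []
    for p, c in zip(points, labels):
        centroid, spread = stats[c]
        # A class whose members all coincide has zero spread; score those 0.
        scores.append(dist(p, centroid) / spread if spread > 0 else 0.0)
    return scores


# Hypothetical toy data: the fourth point is labeled "a" but lies far
# from the other "a" points.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.0), (8.0, 8.0), (8.1, 7.9)]
labels = ["a", "a", "a", "a", "b", "b"]
scores = class_deviation_scores(points, labels)
```

Under this stand-in, a score well above 1 marks a point that sits unusually far from the rest of its class, and the fourth point in the toy data receives the highest score. A real implementation would follow the SOF definition given in the full paper.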



Published In

WAIM '02: Proceedings of the Third International Conference on Advances in Web-Age Information Management
August 2002
443 pages
ISBN:3540440453

Publisher

Springer-Verlag

Berlin, Heidelberg

Cited By

  • (2015) Mining meaningful outlier using rough-negative association algorithm in heartdisease dataset. Proceedings of the 9th International Conference on Ubiquitous Information Management and Communication, pp. 1-7. DOI: 10.1145/2701126.2701182. Online publication date: 8-Jan-2015.
  • (2012) Detection of Outlier Residues for Improving Interface Prediction in Protein Heterocomplexes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 9:4, pp. 1155-1165. DOI: 10.1109/TCBB.2012.58. Online publication date: 1-Jul-2012.
  • (2009) Anomaly detection. ACM Computing Surveys 41:3, pp. 1-58. DOI: 10.1145/1541880.1541882. Online publication date: 30-Jul-2009.
  • (2006) An approach based on wavelet analysis and non-linear mapping to detect anomalies in dataset. Proceedings of the Third international conference on Fuzzy Systems and Knowledge Discovery, pp. 545-548. DOI: 10.1007/11881599_64. Online publication date: 24-Sep-2006.
  • (2006) A fast greedy algorithm for outlier mining. Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining, pp. 567-576. DOI: 10.1007/11731139_67. Online publication date: 9-Apr-2006.
  • (2005) An optimization model for outlier detection in categorical data. Proceedings of the 2005 international conference on Advances in Intelligent Computing - Volume Part I, pp. 400-409. DOI: 10.1007/11538059_42. Online publication date: 23-Aug-2005.
  • (2005) Collusion set detection through outlier discovery. Proceedings of the 2005 IEEE international conference on Intelligence and Security Informatics, pp. 1-13. DOI: 10.1007/11427995_1. Online publication date: 19-May-2005.
  • (2003) Discovering cluster-based local outliers. Pattern Recognition Letters 24:9-10, pp. 1641-1650. DOI: 10.1016/S0167-8655(03)00003-5. Online publication date: 1-Jun-2003.
