An Optimization Model for Outlier Detection in Categorical Data

He, Zengyou; Xu, Xiaofei; Deng, Shengchun

arXiv:cs/0503081 (cs)

[Submitted on 29 Mar 2005]

Abstract: The task of outlier detection is to find small groups of data objects that are exceptional when compared with the large remainder of the data. Detecting such outliers is important for many applications, such as fraud detection and customer migration. Most existing methods are designed for numeric data and run into problems in real-life applications that contain categorical data. In this paper, we formally define outlier detection in categorical data as an optimization problem from a global viewpoint. Moreover, we present an algorithm based on a local-search heuristic for efficiently finding feasible solutions. Experimental results on real datasets and large synthetic datasets demonstrate the superiority of our model and algorithm.
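
The abstract names an optimization formulation and a local-search heuristic but gives neither the objective nor the search moves. The sketch below is therefore only an illustration under stated assumptions, not the authors' algorithm: it assumes the objective is the sum of per-attribute Shannon entropies of the records left after removing k candidate outliers, and that the search repeatedly swaps one candidate outlier with one non-outlier whenever the swap lowers that objective. The names `total_entropy` and `local_search_outliers`, and the parameters `max_passes` and `seed`, are hypothetical.

```python
import math
import random
from collections import Counter


def total_entropy(rows):
    """Sum of per-attribute Shannon entropies (in bits) over the given rows.

    Assumed objective: the abstract does not say which measure the paper uses.
    """
    if not rows:
        return 0.0
    n = len(rows)
    total = 0.0
    for column in zip(*rows):
        for count in Counter(column).values():
            p = count / n
            total -= p * math.log2(p)
    return total


def local_search_outliers(rows, k, max_passes=10, seed=0):
    """Swap-based local search for k rows whose removal minimizes the
    assumed entropy objective on the remaining rows."""
    rng = random.Random(seed)
    n = len(rows)
    outliers = set(rng.sample(range(n), k))  # random initial solution

    def objective(candidate):
        return total_entropy([r for i, r in enumerate(rows) if i not in candidate])

    best = objective(outliers)
    for _ in range(max_passes):
        improved = False
        for i in range(n):
            if i in outliers:
                continue
            # Try exchanging row i with each current outlier and
            # remember the swap that lowers the objective the most.
            best_swap, best_val = None, best
            for j in outliers:
                val = objective((outliers - {j}) | {i})
                if val < best_val - 1e-12:
                    best_swap, best_val = j, val
            if best_swap is not None:
                outliers.remove(best_swap)
                outliers.add(i)
                best = best_val
                improved = True
        if not improved:
            break  # local optimum: no single swap helps
    return sorted(outliers), best


# Toy categorical data: eight identical records and two unusual ones.
rows = [("a", "x")] * 8 + [("b", "y"), ("c", "z")]
found, value = local_search_outliers(rows, k=2)
print(found, value)
```

On this toy input the search settles on the two unusual records (indices 8 and 9), because removing them leaves eight identical rows with zero entropy. Each objective evaluation here rescans the data, which is fine for a sketch but would need incremental count updates on large datasets.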

Comments:	12 pages
Subjects:	Databases (cs.DB); Artificial Intelligence (cs.AI)
Report number:	Tr-05-0329
Cite as:	arXiv:cs/0503081 [cs.DB]
	(or arXiv:cs/0503081v1 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.cs/0503081

Submission history

From: Zengyou He
[v1] Tue, 29 Mar 2005 13:31:01 UTC (157 KB)
