eMailSift: Email classification based on structure and content

M Aery, S Chakravarthy - … Conference on Data Mining (ICDM'05 …, 2005 - ieeexplore.ieee.org
Fifth IEEE International Conference on Data Mining (ICDM'05), 2005
In this paper, we propose a novel approach that uses the structure as well as the content of emails in a folder for email classification. Our approach is based on the premise that representative (common and recurring) structures/patterns can be extracted from a pre-classified email folder and then used effectively to classify incoming emails. A number of factors that influence representative structure extraction and classification are analyzed conceptually and validated experimentally. In our approach, the notion of inexact graph match is leveraged for deriving structures that provide coverage for characterizing folder contents. Extensive experimentation validates the selection of parameters and the effectiveness of our approach for email classification.