Abstract
E-mail is now one of the most important means of communication, but most users suffer from Spam mail. Much research has addressed this problem, and previous approaches have shown comparatively high performance, but they need several improvements before they can be adopted in the real world. First, a filter needs personalized learning to perform well: Spam cannot be strictly defined, because what counts as Spam depends on each user. Second, a filter must handle concept drift or interest drift, that is, a user's interests or the concept behind a given context may change over time. For this reason, many Spam filtering systems use continuous learning schemes such as adaptive learning or incremental learning. However, these systems require users to supply feedback or ratings manually, and this inconvenience slows both learning and performance improvement. In this research, we developed an adaptive learning system based on automatic weighting. For the automatic weighting, we categorized six user patterns (actions) in the mail system, whose weights are adapted automatically during the learning phase. In our experiments, we demonstrate Bayesian classification in an adaptive learning environment and compare the results of the proposed approach with those of adaptive learning. Finally, in experiments on real-world data sets, we show its potential for tracking concept drift and interest drift.
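The idea of letting user actions supply training weights automatically can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the six actions or give their weights, so the action names, the class each one implies, and the numeric weights below are all hypothetical placeholders, and the classifier is a plain naive Bayes with weighted counts and Laplace smoothing.

```python
import math
from collections import defaultdict

# Hypothetical mapping from a user action to (implied class, training weight).
# The abstract mentions six actions but not their names or weights;
# every entry here is an illustrative assumption.
ACTION_WEIGHTS = {
    "reply": ("ham", 2.0),
    "forward": ("ham", 1.5),
    "move_to_folder": ("ham", 1.5),
    "read": ("ham", 1.0),
    "delete_unread": ("spam", 1.5),
    "report_spam": ("spam", 3.0),
}


class WeightedNaiveBayes:
    """Naive Bayes whose counts are incremented by action-dependent weights."""

    def __init__(self):
        self.token_weight = {"spam": defaultdict(float), "ham": defaultdict(float)}
        self.class_weight = {"spam": 0.0, "ham": 0.0}
        self.vocab = set()

    def observe(self, tokens, action):
        # Incremental update: no manual rating, the action decides label and weight.
        label, weight = ACTION_WEIGHTS[action]
        self.class_weight[label] += weight
        for token in tokens:
            self.token_weight[label][token] += weight
            self.vocab.add(token)

    def log_score(self, tokens, label):
        # Laplace-smoothed log prior plus log likelihood of each token.
        total = sum(self.class_weight.values())
        score = math.log((self.class_weight[label] + 1.0) / (total + 2.0))
        denom = sum(self.token_weight[label].values()) + len(self.vocab) + 1.0
        for token in tokens:
            score += math.log((self.token_weight[label].get(token, 0.0) + 1.0) / denom)
        return score

    def classify(self, tokens):
        return max(("spam", "ham"), key=lambda label: self.log_score(tokens, label))


if __name__ == "__main__":
    clf = WeightedNaiveBayes()
    clf.observe(["cheap", "pills", "offer"], "report_spam")
    clf.observe(["meeting", "agenda", "project"], "reply")
    print(clf.classify(["cheap", "offer"]))
    print(clf.classify(["project", "meeting"]))
```

Because every observed action updates the counts immediately, later actions on similar mail can shift the classifier's decision over time, which is the property a drift-tracking filter relies on. A fuller treatment would also decay old evidence, which this sketch omits.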
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, HJ., Shrestha, J., Kim, HN., Jo, GS. (2006). User Action Based Adaptive Learning with Weighted Bayesian Classification for Filtering Spam Mail. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_83
DOI: https://doi.org/10.1007/11941439_83
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49787-5
Online ISBN: 978-3-540-49788-2
eBook Packages: Computer Science (R0)