Abstract
Contextual advertising is an important part of today’s Web. It benefits all parties: Web site owners and the advertising platform share the revenue, advertisers gain new customers, and Web site visitors get useful reference links. For the whole system to work, the ads selected for a Web page must be relevant to it. Problems such as homonymy and polysemy, low keyword overlap, and context mismatch can lead to the selection of irrelevant ads, so simple keyword matching yields poor accuracy. In this paper, we propose a method for improving the relevance of contextual ads: a novel “Wikipedia matching” technique that uses Wikipedia articles as “reference points” for ad selection. We show how to combine the new method with existing solutions to increase overall performance. We conduct an experimental evaluation on a set of real ads and a set of pages from news Web sites. The results show that the proposed method performs better than existing matching strategies, and that using Wikipedia matching in combination with existing approaches provides up to a 50% lift in average precision. The TREC standard measure bpref-10 also confirms the positive effect of Wikipedia matching on ad selection.
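The abstract does not spell out the matching procedure, so the following is only a minimal sketch of the "reference points" idea under assumed details, not the algorithm evaluated in the paper. It assumes plain term-frequency vectors, cosine similarity, and a tiny hand-made stand-in for Wikipedia articles. All names (`tf_vector`, `reference_profile`, the toy articles, page, and ads) are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (lowercased, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def reference_profile(text, articles):
    """Similarity of a text to each reference article, in a fixed order."""
    vec = tf_vector(text)
    return [cosine(vec, tf_vector(body)) for body in articles.values()]


def profile_similarity(p, q):
    """Cosine similarity between two dense reference profiles."""
    dot = sum(x * y for x, y in zip(p, q))
    norm = math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in q))
    return dot / norm if norm else 0.0


# Toy stand-ins for Wikipedia articles (assumed data, not from the paper).
articles = {
    "Jaguar (animal)": "jaguar big cat predator jungle wildlife",
    "Jaguar Cars": "jaguar car luxury vehicle engine dealer",
}
page = "the jaguar is a predator that hunts in the jungle"
ads = {
    "zoo": "wildlife safari tickets see big cat",
    "dealer": "luxury vehicle dealer test drive new engine",
}

# Baseline: direct keyword matching between the page and each ad.
page_vec = tf_vector(page)
keyword_scores = {name: cosine(page_vec, tf_vector(text)) for name, text in ads.items()}

# Reference-point matching: compare page and ad through their similarity
# to the same set of reference articles instead of through shared words.
page_profile = reference_profile(page, articles)
wiki_scores = {
    name: profile_similarity(page_profile, reference_profile(text, articles))
    for name, text in ads.items()
}

best_ad = max(wiki_scores, key=wiki_scores.get)
```

On this toy data neither ad shares a word with the page, so direct keyword matching scores both at zero and cannot rank them. The reference-point comparison ranks the zoo ad above the car dealer ad, because the page and the zoo ad both resemble the animal article. This illustrates how a shared reference set can help with low keyword overlap and homonymy.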
Cite this article
Pak, A.N., Chung, CW. A Wikipedia Matching Approach to Contextual Advertising. World Wide Web 13, 251–274 (2010). https://doi.org/10.1007/s11280-010-0084-2
DOI: https://doi.org/10.1007/s11280-010-0084-2