DOI: 10.1145/1871437.1871713
poster

Relevance-index size tradeoff in contextual advertising

Published: 26 October 2010

Abstract

In contextual advertising, textual ads relevant to the content of a webpage are embedded in the page. Content keywords are extracted offline by crawling webpages and are then stored in an index for fast serving. Given a page, ad selection involves an index lookup, computing the similarity between the keywords of the page and those of candidate ads, and returning the top-k scoring ads. This approach entails a tradeoff between relevance and index size: better relevance can be achieved if the index size is unlimited. However, assuming an unlimited index size is impractical, first because of the large number of pages on the Web and the stringent requirements on serving latency. Second, page visits on the Web follow a power-law distribution in which a significant proportion of pages, called the tail pages, are visited infrequently. Indexing tail pages is inefficient because these pages are accessed so rarely.
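The serving path described above (index lookup, similarity scoring, top-k selection) can be illustrated with a minimal sketch. The abstract does not name the similarity function, so cosine similarity over sparse keyword weights is an assumption here, and `select_ads`, `page_index`, and `ads` are hypothetical names chosen for illustration.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two sparse keyword -> weight dicts.

    Assumption: the abstract only says "similarity"; cosine is one common choice.
    """
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_ads(page_url, page_index, ads, k=3):
    """Look up the page's keywords, score every candidate ad, return the top-k ad ids."""
    page_kw = page_index.get(page_url)  # index lookup
    if page_kw is None:                 # page not indexed: nothing to match against
        return []
    scored = ((cosine(page_kw, ad_kw), ad_id) for ad_id, ad_kw in ads.items())
    return [ad_id for _, ad_id in heapq.nlargest(k, scored)]


# Hypothetical toy data, not taken from the paper.
page_index = {"example.com/cameras/dslr": {"camera": 0.8, "lens": 0.6}}
ads = {
    "ad_lens": {"lens": 1.0},
    "ad_camera": {"camera": 1.0},
    "ad_shoes": {"shoes": 1.0},
}
top_ads = select_ads("example.com/cameras/dslr", page_index, ads, k=2)
```

In this sketch every indexed page costs one entry in `page_index`, which is where the index-size pressure comes from.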
We propose a novel mechanism that mitigates both problems within a single framework. The basic idea is to index the same keyword vector for a set of similar pages. The scheme learns a website-specific hierarchy from the (page, URL) pairs of the website. Keywords are then populated on the nodes via a bottom-up traversal of the hierarchy. We evaluate our approach on a human-labeled dataset, where it achieves higher nDCG than a recent approach even though its index is 7 times smaller.
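The idea of sharing one keyword vector among similar pages can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the hierarchy is taken directly from URL path segments rather than learned, pages are assumed to sit at leaf nodes, aggregation is a page-weighted mean, and the index is cut at a fixed depth. All names (`Node`, `build_tree`, `aggregate`, `build_index`) are hypothetical.

```python
from collections import defaultdict


class Node:
    def __init__(self):
        self.children = {}   # path segment -> Node
        self.keywords = {}   # keyword -> weight
        self.pages = 0       # number of pages at or below this node


def build_tree(pages):
    """Assumption: the URL path itself serves as the site hierarchy."""
    root = Node()
    for url, keywords in pages.items():
        node = root
        for segment in url.strip("/").split("/"):
            node = node.children.setdefault(segment, Node())
        node.keywords = dict(keywords)
        node.pages = 1
    return root


def aggregate(node):
    """Bottom-up (post-order) pass: each node gets the page-weighted mean
    of the keyword vectors beneath it."""
    totals = defaultdict(float)
    for kw, w in node.keywords.items():
        totals[kw] += w * node.pages
    pages = node.pages
    for child in node.children.values():
        aggregate(child)  # children are filled in before their parent
        for kw, w in child.keywords.items():
            totals[kw] += w * child.pages
        pages += child.pages
    if pages:
        node.keywords = {kw: w / pages for kw, w in totals.items()}
    node.pages = pages


def build_index(node, max_depth, prefix=""):
    """Emit one entry per node at max_depth (or per shallower leaf)."""
    if max_depth == 0 or not node.children:
        return {prefix: node.keywords}
    index = {}
    for segment, child in node.children.items():
        index.update(build_index(child, max_depth - 1, prefix + "/" + segment))
    return index


# Hypothetical toy site, not taken from the paper.
pages = {
    "shop.example/cameras/dslr-a": {"camera": 1.0, "dslr": 1.0},
    "shop.example/cameras/dslr-b": {"camera": 1.0, "lens": 1.0},
    "shop.example/shoes/runner": {"shoes": 1.0},
}
root = build_tree(pages)
aggregate(root)
index = build_index(root, max_depth=2)
```

Here three pages collapse into two index entries, and both camera pages share the vector stored under `/shop.example/cameras`. A shallower cut shrinks the index further at the cost of coarser keywords, which is the tradeoff the abstract describes.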


    Published In

    CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
    October 2010
    2036 pages
    ISBN:9781450300995
    DOI:10.1145/1871437
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. contextual advertising
    2. index size
    3. keyword aggregation
    4. relevance

    Qualifiers

    • Poster

    Conference

    CIKM '10

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Bibliometrics

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 168
    • Downloads (last 12 months): 1
    • Downloads (last 6 weeks): 0
    Reflects downloads up to 10 Oct 2024
