DOI: 10.1145/1135777.1135788
Article

Browsing on small screens: recasting web-page segmentation into an efficient machine learning framework

Published: 23 May 2006

Abstract

Fitting enough information from webpages to make browsing on small screens compelling is a challenging task. One approach is to present the user with a thumbnail image of the full web page and allow the user to simply press a single key to zoom into a region (which may then be transcoded into WML/XHTML, summarized, etc.). However, if regions for zooming are presented naively, this yields a frustrating experience because of the number of coherent regions, sentences, images, and words that may be inadvertently separated. Here, we cast the web page segmentation problem into a machine learning framework, where we re-examine this task through the lens of entropy reduction and decision tree learning. This yields an efficient and effective page segmentation algorithm. We demonstrate how simple techniques from computer vision can be used to fine-tune the results. The resulting segmentation keeps coherent regions together when tested on a broad set of complex webpages.
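As a rough illustration of what "entropy reduction" means in a decision-tree setting, the sketch below picks the horizontal cut that most reduces the entropy of region labels on either side of it. This is not the paper's algorithm, whose details the abstract does not give. The element representation (a vertical position paired with a region label), the function names `entropy` and `best_cut`, and the sample page are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def best_cut(elements):
    """Return (cut_y, gain) for the cut with the largest entropy reduction.

    elements: list of (y_center, label) pairs. This representation is
    hypothetical; it stands in for whatever page features a real system uses.
    """
    elements = sorted(elements)
    labels = [label for _, label in elements]
    base = entropy(labels)
    n = len(elements)
    best = (None, 0.0)
    for i in range(1, n):
        y_prev, y_next = elements[i - 1][0], elements[i][0]
        if y_prev == y_next:
            continue  # no room to place a cut between equal positions
        above, below = labels[:i], labels[i:]
        # Weighted entropy remaining after the split (decision-tree style).
        remainder = (len(above) / n) * entropy(above) \
            + (len(below) / n) * entropy(below)
        gain = base - remainder
        if gain > best[1]:
            best = ((y_prev + y_next) / 2, gain)
    return best


# Hypothetical page: two navigation items above three article items.
page = [(10, "nav"), (20, "nav"), (120, "article"),
        (140, "article"), (160, "article")]
cut_y, gain = best_cut(page)
print(cut_y, round(gain, 3))
```

On this toy input the chosen cut falls between the navigation items and the article items, which is the kind of split that keeps coherent regions together. Applying the same choice recursively to each side would grow a tree of nested regions.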




    Published In

    WWW '06: Proceedings of the 15th international conference on World Wide Web
    May 2006
    1102 pages
    ISBN: 1595933239
    DOI: 10.1145/1135777

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. browser
    2. machine learning
    3. mobile browsing
    4. mobile devices
    5. small screen
    6. thumbnail browsing
    7. web page segmentation

    Qualifiers

    • Article

    Conference

    WWW06
    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Article Metrics

    • Downloads (Last 12 months): 11
    • Downloads (Last 6 weeks): 1
    Reflects downloads up to 10 Oct 2024

    Cited By

    • (2024) Predicting eye-tracking assisted web page segmentation. Multimedia Tools and Applications. DOI: 10.1007/s11042-024-20202-1. Online publication date: 9-Sep-2024
    • (2023) Aesthetics++: Refining Graphic Designs by Exploring Design Principles and Human Preference. IEEE Transactions on Visualization and Computer Graphics 29:6 (3093-3104). DOI: 10.1109/TVCG.2022.3151617. Online publication date: 1-Jun-2023
    • (2022) Multimodal Web Page Segmentation Using Self-organized Multi-objective Clustering. ACM Transactions on Information Systems 40:3 (1-49). DOI: 10.1145/3480966. Online publication date: 7-Mar-2022
    • (2022) Banner layout retargeting with hierarchical reinforcement learning and variational autoencoder. Multimedia Tools and Applications 81:24 (34417-34438). DOI: 10.1007/s11042-022-13325-w. Online publication date: 12-Aug-2022
    • (2021) Content-Determined Web Page Segmentation and Navigation for Mobile Web Searching. Result Page Generation for Web Searching (88-108). DOI: 10.4018/978-1-7998-0961-6.ch007. Online publication date: 2021
    • (2021) Attribute-Conditioned Layout GAN for Automatic Graphic Design. IEEE Transactions on Visualization and Computer Graphics 27:10 (4039-4048). DOI: 10.1109/TVCG.2020.2999335. Online publication date: 1-Oct-2021
    • (2020) Web Page Segmentation Revisited. Proceedings of the 29th ACM International Conference on Information & Knowledge Management (3047-3054). DOI: 10.1145/3340531.3412782. Online publication date: 19-Oct-2020
    • (2019) Constructing Novel Block Layouts for Webpage Analysis. ACM Transactions on Internet Technology 19:3 (1-18). DOI: 10.1145/3326457. Online publication date: 10-Jul-2019
    • (2018) BERyL: A System for Web Block Classification. Transactions on Computational Science XXXIII (61-78). DOI: 10.1007/978-3-662-58039-4_4. Online publication date: 16-Sep-2018
    • (2017) A novel algorithm for extracting the user reviews from web pages. Journal of Information Science 43:5 (696-712). DOI: 10.1177/0165551516666446. Online publication date: 1-Oct-2017
