Information gathering in the World-Wide Web: the W3QL query language and the W3QS system

Published: 01 December 1998
  • Abstract

    The World Wide Web (WWW) is a fast growing global information resource. It contains an enormous amount of information and provides access to a variety of services. Since there is no central control and very few standards of information organization or service offering, searching for information and services is a widely recognized problem. To some degree this problem is solved by “search services,” also known as “indexers,” such as Lycos, AltaVista, Yahoo, and others. These sites employ search engines known as “robots” or “knowbots” that scan the network periodically and form text-based indices. These services are limited in certain important aspects. First, the structural information, namely, the organization of the document into parts pointing to each other, is usually lost. Second, one is limited by the kind of textual analysis provided by the “search service.” Third, search services are incapable of navigating “through” forms. Finally, one cannot prescribe a complex database-like search. We view the WWW as a huge database. We have designed a high-level SQL-like language called W3QL to support effective and flexible query processing, which addresses the structure and content of WWW nodes and their varied sorts of data. We have implemented a system called W3QS to execute W3QL queries. In W3QS, query results are declaratively specified and continuously maintained as views when desired. The current architecture of W3QS provides a server that enables users to pose queries as well as integrate their own data analysis tools. The system and its query language set a framework for the development of database-like tools over the WWW. A significant contribution of this article is in formalizing the WWW and query processing over it.
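
To make the "WWW as a huge database" view concrete, here is a minimal Python sketch. It is not W3QL syntax and not the W3QS implementation; the `Node` class, the `query_paths` function, the `WEB` table, and the example URLs are hypothetical names chosen only for illustration. It models pages as nodes with text content and outgoing links, then selects every path of bounded length from a start node whose final node satisfies a content predicate. That is the kind of combined structure-and-content condition the abstract says a text-only index cannot express.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """A hypothetical web node: a URL, its text, and the URLs it links to."""
    url: str
    text: str
    links: List[str] = field(default_factory=list)


def query_paths(
    web: Dict[str, Node],
    start: str,
    max_hops: int,
    predicate: Callable[[Node], bool],
) -> List[List[str]]:
    """Return every cycle-free path of 1..max_hops links from `start`
    whose last node satisfies `predicate` (illustrative only)."""
    results: List[List[str]] = []

    def walk(path: List[str]) -> None:
        node = web[path[-1]]
        # Report the path if it has at least one link and its end node matches.
        if len(path) > 1 and predicate(node):
            results.append(list(path))
        # Stop extending once the hop limit is reached.
        if len(path) - 1 == max_hops:
            return
        for target in node.links:
            # Skip dangling links and nodes already on the current path.
            if target in web and target not in path:
                path.append(target)
                walk(path)
                path.pop()

    if start in web:
        walk([start])
    return results


# A tiny in-memory "web" (assumed data, not taken from the article).
WEB: Dict[str, Node] = {
    "a": Node("a", "home page", ["b", "c"]),
    "b": Node("b", "papers on databases", ["c", "d"]),
    "c": Node("c", "query language notes", ["a"]),
    "d": Node("d", "W3QL query examples", []),
}

if __name__ == "__main__":
    for path in query_paths(WEB, "a", 2, lambda n: "query" in n.text):
        print(" -> ".join(path))
```

The traversal keeps the link structure in the result (each answer is a path, not just a page), which is the information the abstract notes is usually lost by index-based search services.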



    Published In

    ACM Transactions on Database Systems, Volume 23, Issue 4
    Dec. 1998
    122 pages
    ISSN: 0362-5915
    EISSN: 1557-4644
    DOI: 10.1145/296854
    Editor: Won Kim

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. CGI
    2. FORMS
    3. HTML
    4. HTTP
    5. PERL
    6. World-Wide Web
    7. query language
    8. query system


