Article
DOI: 10.1145/371920.372182

IEPAD: information extraction based on pattern discovery

Published: 01 April 2001

    References

    [1]
    Chang, C.H.; Lui, S.C.; and Wu, Y.C. Applying pattern mining to Web information extraction. In Proceedings of the Fifth Pacific Asia Conference on Knowledge Discovery and Data Mining, Apr. 2001, Hong Kong.
    [2]
    Chien, L.F. PAT-tree-based keyword extraction for Chinese information retrieval. In Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval. pp. 50-58. 1997.
    [3]
    Doorenbos, R.B.; Etzioni, O.; and Weld, D.S. A scalable comparison-shopping agent for the World Wide Web. In Proceedings of the first international conference on Autonomous Agents. pp. 39-48, New York, NY, 1997, ACM Press.
    [4]
    Embley, D.; Jiang, Y.; and Ng, Y.-K. 1999. Record-boundary discovery in Web documents. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data (SIGMOD'99). pp. 467-478, Philadelphia, Pennsylvania.
    [5]
    Gonnet, G.H.; Baeza-Yates, R.A.; and Snider, T. 1992. New Indices for Text: Pat trees and Pat Arrays. Information Retrieval: Data Structures and Algorithms, Prentice Hall.
    [6]
    Gusfield, D. 1997. Algorithms on strings, trees, and sequences. Cambridge University Press.
    [7]
    Hsu, C.-N., and Dung, M.-T. 1998. Generating finite-state transducers for semi-structured data extraction from the Web. Information Systems. 23(8): 521-538.
    [8]
    Knoblock, A. et al., Eds. 1998. In Proceedings of the 1998 Workshop on AI and Information Integration, Menlo Park, California. AAAI Press.
    [9]
    Kurtz, S., and Schleiermacher, C. 1999. REPuter: fast computation of maximal repeats in complete genomes. Bioinformatics 15(5): 426-427.
    [10]
    Kushmerick, N. 1999. Gleaning the Web. IEEE Intelligent Systems 14(2): 20-22.
    [11]
    Kushmerick, N.; Weld, D.; and Doorenbos, R. 1997. Wrapper induction for information extraction. In Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI).
    [12]
    Morrison, D. R. Journal of ACM, 15, pp. 514-534, 1968.
    [13]
    Muslea, I.; Minton, S.; and Knoblock, C. 1999. A hierarchical approach to wrapper induction. In Proceedings of the 3rd International Conference on Autonomous Agents (Agents '99), Seattle, WA.
    [14]
    Muslea, I. 1999. Extraction patterns for information extraction tasks: a survey. In Proceedings of AAAI '99: Workshop on Machine Learning for Information Extraction.
    [15]
    Sedgewick, R. Algorithms in C, Addison Wesley, 1990.

    Published In

    WWW '01: Proceedings of the 10th international conference on World Wide Web
    May 2001
    770 pages
    ISBN:1581133480
    DOI:10.1145/371920

    Sponsors

    • IW3C2: International World Wide Web Conference Committee

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. PAT tree
    2. extraction rule
    3. information extraction
    4. multiple string alignment

    Qualifiers

    • Article

    Conference

    WWW01
    Sponsor:
    • IW3C2

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Cited By

    • (2023) EDREW - Enhanced Data Representation for Extraction in Web. Proceedings of the 29th Brazilian Symposium on Multimedia and the Web (230-237). DOI: 10.1145/3617023.3617055. Online publication date: 23-Oct-2023
    • (2022) Web Record Extraction with Invariants. Proceedings of the VLDB Endowment 16:4 (959-972). DOI: 10.14778/3574245.3574276. Online publication date: 1-Dec-2022
    • (2021) A Framework for Automated Scraping of Structured Data Records From the Deep Web Using Semantic Labeling. International Journal of Information Retrieval Research 12:1 (1-18). DOI: 10.4018/IJIRR.290830. Online publication date: 4-Nov-2021
    • (2021) Trends in web data extraction using machine learning. Web Intelligence (1-22). DOI: 10.3233/WEB-210465. Online publication date: 15-Nov-2021
    • (2021) Web question answering with neurosymbolic program synthesis. Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (328-343). DOI: 10.1145/3453483.3454047. Online publication date: 19-Jun-2021
    • (2021) A Study on Different Aspects of Web Mining and Research Issues. IOP Conference Series: Materials Science and Engineering 1022 (012018). DOI: 10.1088/1757-899X/1022/1/012018. Online publication date: 19-Jan-2021
    • (2021) A survey on semi-structured web data manipulations by non-expert users. Computer Science Review 40 (100367). DOI: 10.1016/j.cosrev.2021.100367. Online publication date: May-2021
    • (2020) The Significance of Blockchain Technology in Digital Transformation of Logistics and Transportation. International Journal of E-Services and Mobile Applications 12:1 (1-20). DOI: 10.4018/IJESMA.2020010101. Online publication date: 1-Jan-2020
    • (2020) Segmentation and Ranking of Online Reviewer Community. International Journal of E-Adoption 12:1 (63-83). DOI: 10.4018/IJEA.2020010106. Online publication date: 1-Jan-2020
    • (2020) Experimenting Language Identification for Sentiment Analysis of English Punjabi Code Mixed Social Media Text. International Journal of E-Adoption 12:1 (52-62). DOI: 10.4018/IJEA.2020010105. Online publication date: 1-Jan-2020
