DOI: 10.1145/2661829.2661840
demonstration

WiiCluster: a Platform for Wikipedia Infobox Generation

Published: 03 November 2014
    Abstract

    Wikipedia has become one of the best sources for creating and sharing a massive volume of human knowledge. Much effort has been devoted to generating and enriching structured data by automatic information extraction from unstructured text in Wikipedia. Most, if not all, of the existing work shares the same paradigm: it starts with information extraction over the unstructured text data, followed by supervised machine learning. Although remarkable progress has been made, this paradigm has its own limitations in terms of effectiveness, scalability, and high labeling cost.
    We present WiiCluster, a scalable platform for automatically generating infoboxes for articles in Wikipedia. The heart of our system is an effective cluster-then-label algorithm over a rich set of semi-structured data in Wikipedia articles: linked entities. It is totally unsupervised and thus does not require any human labels. It is effective in generating semantically meaningful summaries of Wikipedia articles. We further propose a cluster-reuse algorithm to scale up our system. Overall, WiiCluster is able to generate nearly 10 million new facts. We also develop a web-based platform to demonstrate WiiCluster, which enables users to access and browse the generated knowledge.
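
    The abstract names a cluster-then-label step over linked entities but does not spell out its details. The following is only a minimal illustrative sketch of that general idea, not the WiiCluster algorithm itself. It assumes each linked entity comes with a set of category strings, groups entities greedily by Jaccard similarity, and labels each group with its most frequent category. The function names, the threshold, and the example data are all hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_then_label(entities, threshold=0.3):
    """Greedy single-pass clustering of linked entities, then labeling.

    `entities` maps an entity name to a set of category strings. This
    feature representation is an assumption made for illustration; it
    is not taken from the paper. Returns a list of (label, members).
    """
    clusters = []  # each cluster: {"members": [...], "features": set()}
    for name, cats in entities.items():
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = jaccard(cats, cluster["features"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            # No existing cluster is similar enough: start a new one.
            clusters.append({"members": [name], "features": set(cats)})
        else:
            best["members"].append(name)
            best["features"] |= cats

    labeled = []
    for cluster in clusters:
        counts = Counter(c for m in cluster["members"] for c in entities[m])
        # Most frequent category wins; ties are broken alphabetically
        # so the result does not depend on set iteration order.
        label = min(counts, key=lambda c: (-counts[c], c)) if counts else "unknown"
        labeled.append((label, cluster["members"]))
    return labeled


# Hypothetical linked entities of one article, with made-up categories.
example = {
    "Honolulu": {"city", "hawaii"},
    "Chicago": {"city", "illinois"},
    "Harvard Law School": {"university", "massachusetts"},
    "Columbia University": {"university", "new_york"},
}
result = cluster_then_label(example)
```

    Under these assumptions, each (label, members) pair plays the role of one candidate infobox row: the label stands in for the attribute name and the members for its values. No labeled training data is involved, which matches the unsupervised setting the abstract describes.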

    Cited By

    • (2024) Box2Go: Collaborative Interactive Infobox Filling. Companion Proceedings of the ACM on Web Conference 2024, pp. 1003-1006. DOI: 10.1145/3589335.3651235. Online publication date: 13-May-2024
    • (2023) State-of-the-art approach to extractive text summarization: a comprehensive review. Multimedia Tools and Applications 82:19, pp. 29135-29197. DOI: 10.1007/s11042-023-14613-9. Online publication date: 16-Feb-2023
    • (2016) An overview of Text Summarization techniques. 2016 International Conference on Computing Communication Control and automation (ICCUBEA), pp. 1-7. DOI: 10.1109/ICCUBEA.2016.7860024. Online publication date: Aug-2016

    Published In

    CIKM '14: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
    November 2014
    2152 pages
    ISBN: 9781450325981
    DOI: 10.1145/2661829
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cluster visualization
    2. knowledge extraction
    3. summarization

    Qualifiers

    • Demonstration

    Conference

    CIKM '14
    Acceptance Rates

    CIKM '14 Paper Acceptance Rate 175 of 838 submissions, 21%;
    Overall Acceptance Rate 1,690 of 7,572 submissions, 22%
