DOI: 10.1145/3308558.3313629
research-article

TableNet: An Approach for Determining Fine-grained Relations for Wikipedia Tables

Published: 13 May 2019

    Abstract

    We focus on the problem of interlinking Wikipedia tables with fine-grained table relations: equivalent and subPartOf. Such relations allow us to harness semantically related information by accessing related tables or the facts therein. Determining the type of a relation is not trivial: relations depend on the schemas, the cell values, and the semantic overlap of the cell values in the tables.
    We propose TableNet, an approach for interlinking tables with subPartOf and equivalent relations. TableNet consists of two main steps: (i) for any source table, an efficient algorithm finds candidate related tables with high coverage, and (ii) a neural approach determines the fine-grained relation with high accuracy, based on the table schemas and data.
    Based on an extensive evaluation with more than 3.2M tables, we show that TableNet retains more than 88% of relevant table pairs and assigns table relations with an accuracy of 90%.
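
The two-step structure described in the abstract (candidate generation, then relation typing) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Table` class, the Jaccard threshold on column names, and the set-containment rule in `toy_relation` are hypothetical stand-ins. They are not TableNet's candidate algorithm or its neural model, and the containment rule does not claim to reproduce the paper's definitions of equivalent and subPartOf.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical table: a name plus a mapping column name -> set of cell values."""
    name: str
    columns: dict[str, set[str]]


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_tables(source: Table, corpus: list[Table],
                     min_overlap: float = 0.5) -> list[Table]:
    """Step (i), assumed filter: keep tables whose column names overlap enough."""
    src_cols = set(source.columns)
    return [t for t in corpus
            if t.name != source.name
            and jaccard(src_cols, set(t.columns)) >= min_overlap]


def toy_relation(source: Table, candidate: Table) -> str:
    """Step (ii), toy rule standing in for the neural classifier (assumption)."""
    shared = set(source.columns) & set(candidate.columns)
    if not shared:
        return "none"
    src = {(c, v) for c in shared for v in source.columns[c]}
    cand = {(c, v) for c in shared for v in candidate.columns[c]}
    if src == cand:
        return "equivalent"
    if src < cand:  # every source cell also appears in the candidate
        return "subPartOf"
    return "none"


countries = Table("countries", {"country": {"France", "Germany", "Italy"},
                                "capital": {"Paris", "Berlin", "Rome"}})
france = Table("france_only", {"country": {"France"}, "capital": {"Paris"}})
players = Table("players", {"player": {"Ada"}, "team": {"Reds"}})
corpus = [countries, france, players]

for cand in candidate_tables(france, corpus):
    print(cand.name, toy_relation(france, cand))  # prints "countries subPartOf"
```

The cheap schema filter runs first so that the more expensive pairwise step only sees a short candidate list, which mirrors the coverage-then-accuracy split reported in the abstract.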

    Published In

    WWW '19: The World Wide Web Conference
    May 2019
    3620 pages
    ISBN: 9781450366748
    DOI: 10.1145/3308558
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    In-Cooperation

    • IW3C2: International World Wide Web Conference Committee

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WWW '19
    WWW '19: The Web Conference
    May 13 - 17, 2019
    San Francisco, CA, USA

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Article Metrics

    • Downloads (last 12 months): 12
    • Downloads (last 6 weeks): 2
    Reflects downloads up to 27 Jul 2024.

    Cited By

    • (2024) Large Language Models for Tabular Data: Progresses and Future Directions. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2997-3000. DOI: 10.1145/3626772.3661384. Online publication date: 10-Jul-2024
    • (2023) RECA: Related Tables Enhanced Column Semantic Type Annotation Framework. Proceedings of the VLDB Endowment 16:6, 1319-1331. DOI: 10.14778/3583140.3583149. Online publication date: 1-Feb-2023
    • (2023) Catch: Collaborative Feature Set Search for Automated Feature Engineering. Proceedings of the ACM Web Conference 2023, 1886-1896. DOI: 10.1145/3543507.3583527. Online publication date: 30-Apr-2023
    • (2022) Question Answering for the Curated Web. Online publication date: 10-Mar-2022
    • (2021) DG‐based SPO tuple recognition using self‐attention M‐Bi‐LSTM. ETRI Journal 44:3, 438-449. DOI: 10.4218/etrij.2020-0460. Online publication date: 29-Nov-2021
    • (2021) Case Studies on the Motivation and Performance of Contributors Who Verify and Maintain In-Flux Tabular Datasets. Proceedings of the ACM on Human-Computer Interaction 5:CSCW2, 1-25. DOI: 10.1145/3479592. Online publication date: 18-Oct-2021
    • (2021) TUTA. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 1780-1790. DOI: 10.1145/3447548.3467434. Online publication date: 14-Aug-2021
    • (2021) Numerical Formula Recognition from Tables. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 1986-1996. DOI: 10.1145/3447548.3467425. Online publication date: 14-Aug-2021
    • (2021) TCN: Table Convolutional Network for Web Table Interpretation. Proceedings of the Web Conference 2021, 4020-4032. DOI: 10.1145/3442381.3450090. Online publication date: 19-Apr-2021
    • (2021) Information Extraction From Co-Occurring Similar Entities. Proceedings of the Web Conference 2021, 3999-4009. DOI: 10.1145/3442381.3449836. Online publication date: 19-Apr-2021
