DOI: 10.1145/3411764.3445227
research-article

KTabulator: Interactive Ad hoc Table Creation using Knowledge Graphs

Published: 07 May 2021
  • Abstract

    The need to find or construct tables arises routinely to accomplish many tasks in everyday life, as a table is a common format for organizing data. However, when relevant data is found on the web, it is often scattered across multiple tables on different web pages, requiring tedious manual searching and copy-pasting to collect data. We propose KTabulator, an interactive system to effectively extract, build, or extend ad hoc tables from large corpora, by leveraging their computerized structures in the form of knowledge graphs. We developed and evaluated KTabulator using Wikipedia and its knowledge graph DBpedia as our testbed. Starting from an entity or an existing table, KTabulator allows users to extend their tables by finding relevant entities, their properties, and other relevant tables, while providing meaningful suggestions and guidance. The results of a user study indicate the usefulness and efficiency of KTabulator in ad hoc table creation.

    Supplementary Material

    VTT File (3411764.3445227_videofigurecaptions.vtt)
    VTT File (3411764.3445227_videopreviewcaptions.vtt)
    MP4 File (3411764.3445227_videofigure.mp4)
    Supplemental video
    MP4 File (3411764.3445227_videopreview.mp4)
    Preview video

    Cited By

    • (2024) Towards Feature Engineering with Human and AI’s Knowledge: Understanding Data Science Practitioners’ Perceptions in Human&AI-Assisted Feature Engineering Design. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 1789-1804. DOI: 10.1145/3643834.3661517. Online publication date: 1-Jul-2024
    • (2023) Emerging Exploration Strategies of Knowledge Graphs. IEEE Access 11, 94713-94731. DOI: 10.1109/ACCESS.2023.3308514. Online publication date: 2023

        Published In

        CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
        May 2021
        10862 pages
        ISBN:9781450380966
        DOI:10.1145/3411764
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Data tables
        2. data cleaning
        3. data integration
        4. database
        5. user interface.

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Funding Sources

        • Waterloo-Huawei Joint Innovation Laboratory

        Conference

        CHI '21
        Acceptance Rates

        Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

        Article Metrics

        • Downloads (last 12 months): 65
        • Downloads (last 6 weeks): 3
        Reflects downloads up to 27 Jul 2024
