Challenges for Semantically Driven Collaborative Spaces

Pascal Molli; John G. Breslin; Maria-Esther Vidal

Semantic Web Collaborative Spaces, 2016

Provided by the author(s) and NUI Galway in accordance with publisher policies. Please cite the published version when available.

Title: Challenges for semantically driven collaborative spaces
Author(s): Molli, Pascal; Breslin, John G.; Vidal, Maria-Esther
Publication Date: 2016
Publication Information: Molli, Pascal, Breslin, John G., & Vidal, Maria-Esther. (2016). Challenges for semantically driven collaborative spaces. Paper presented at the Semantic Web Collaborative Spaces: Second International Workshop, SWCS 2013, Montpellier, France, May 27, 2013, Third International Workshop, SWCS 2014, Trentino, Italy, October 19, 2014, Revised Selected and Invited Papers.
Publisher: Springer
Link to publisher's version: https://www.springer.com/gp/book/9783319326665
Item record: http://hdl.handle.net/10379/7427
DOI: http://dx.doi.org/10.1007/978-3-319-32667-2_1

Downloaded 2020-06-09T11:14:26Z. Some rights reserved. For more information, please see the item record link above.

Pascal Molli (1), John G. Breslin (2), and Maria-Esther Vidal (3, corresponding author)

(1) University of Nantes, Nantes, France, pascal.molli@univ-nantes.fr
(2) Insight Centre for Data Analytics, National University of Ireland Galway, Galway, Ireland, john.breslin@nuigalway.ie
(3) Universidad Simón Bolívar, Caracas, Venezuela, mvidal@ldc.usb.ve

Abstract. Linked Data initiatives have fostered the publication of more than one thousand datasets in the Linking Open Data (LOD) cloud from a large variety of domains, e.g., Life Sciences, Media, and Government. Albeit large in volume, Linked Data is essentially read-only, and most collaborative tasks of cleaning, enriching, and reasoning are not dynamically available. Collaboration between data producers and consumers is essential for overcoming these limitations, and for fostering the evolution of the LOD cloud into a more participative and collaborative data space. In this paper, we describe the role that collaborative infrastructures can play in creating and maintaining Linked Data, and the benefits of exploiting knowledge represented in ontologies as well as the main features of Semantic Web technologies to effectively assess the LOD cloud’s evolution. First, the advantages of using ontologies for modelling collaborative spaces are discussed, as well as formalisms for assessing semantic collaboration by sharing annotations from terms in domain ontologies. Then, Semantic MediaWiki communities are described, and illustrated with three applications in the domains of formal mathematics, ontology engineering, and pedagogical content management. Next, the problem of exploiting semantics in collaborative spaces is tackled, and three different approaches are described. Finally, we conclude with an outlook on future directions and problems that remain open in the area of semantically-driven collaborative spaces.
1 Introduction

Over the last decade, there has been a rapid increase in the number of users signing up to be part of Web-based social networks. Hundreds of millions of new members are now joining the major services each year. A large amount of content is being shared on these networks: tens of billions of content items are shared each month. In parallel, similar collaborative spaces are being leveraged in both private intranets and enterprise environments; these collaborative spaces have features mirroring those on the public Web.

With this growth in usage and data being generated, there are many opportunities to discover the knowledge that is often inherent but somewhat hidden in these networks. Web mining techniques are being used to derive this hidden knowledge.
In addition, Semantic Web technologies, including Linked Data initiatives to connect previously disconnected datasets, are making it possible to connect data from across various social spaces through common representations and agreed-upon terms for people, content items, etc. In this volume, we outline some current research being carried out to semantically represent the implicit and explicit structures on the Social Web, along with the techniques being used to elicit relevant knowledge from these structures, and the mechanisms that can be used to mesh these semantic representations with intelligent knowledge discovery processes.

2 Modelling Collaborative Communities and the Role of Semantics

Semantics represented in ontologies or vocabularies can be used to enhance the description and modelling of collaborative spaces. In particular, annotating data with terms from ontologies is a common activity that has gained attention with the development of Semantic Web technologies. Scientific communities in the natural sciences, such as the Life Sciences, have actively used ontologies to describe the semantics of scientific concepts. The Gene and Human Phenotype Ontologies have been extensively applied for describing genes, and there are international initiatives to collaboratively annotate organisms, e.g., the Pseudomonas aeruginosa PAO1 genome (http://www.pseudomonas.com/goannotationproject2014.jsp).

One of the main goals to be achieved in order to support precise modelling in collaborative spaces is the development of tools that conduct reasoning on top of existing ontologies and allow for the collaborative annotation of these entities. In this direction, Goy et al. [3] propose a model to represent views of a data set and different versions of annotations of the data enclosed in the view. Annotations can be personal, allowing for the representation of individual conceptualisations of the portion of the domain represented by the view. Additionally, annotations can be shared, in which case they are visible to all members of the collaborative space. The proposed model is part of the project Semantic Table Plus Plus (SemT++) [4], which provides a platform to collaboratively describe web resources, i.e., to semantically describe images, documents, videos, or any other resource publicly available on the Web. Existing ontologies are provided to model knowledge about resources represented using the ontology annotations, e.g., the DOLCE (http://www.loa.istc.cnr.it/old/DOLCE.html) and Geographic (https://www.w3.org/2005/Incubator/geo/XGR-geo-ont-20071023/) ontologies provide controlled vocabularies to describe Web resources. The benefits of using personal annotations are illustrated in a use case where documents are collaboratively described following an authored collaboration policy. This policy allows an authorised user to delete other users’ annotations if there is no agreement across all users, while the annotations may remain in the user’s local view.

3 Semantic MediaWiki Communities

For over ten years on Wikipedia, templates have been used to provide a consistent look to the structured content placed within article texts (these are called infoboxes on Wikipedia). They can also be used to provide a structure for entering data, so that it is possible to easily extract metadata about some aspect of an entity that is the focus of an article (e.g., from a template field called ‘population’ in an article about Galway). Semantic wikis bring this to the next level by allowing users to create semantic annotations anywhere within a wiki article’s text for the purposes of structured access and finer-grained searches, inline querying, and external information reuse.
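
As an illustration of inline semantic annotations, Semantic MediaWiki lets authors write property-value pairs directly in article text using the `[[Property::value]]` markup. The following minimal Python sketch extracts such pairs from a snippet of wiki text. It is only an approximation for illustration: the regular expression is a simplification and not the Semantic MediaWiki parser, and the property names (`Located in`, `Has population`), the sample text, and the population figure are hypothetical.

```python
import re
from collections import defaultdict

# Simplified pattern for [[Property::value]] and [[Property::value|display text]].
# Plain wiki links such as [[Dublin]] contain no "::" and are therefore ignored.
ANNOTATION = re.compile(r"\[\[([^\[\]|:]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_annotations(wikitext):
    """Return a dict mapping each property name to the list of values found."""
    found = defaultdict(list)
    for prop, value in ANNOTATION.findall(wikitext):
        found[prop.strip()].append(value.strip())
    return dict(found)


# Hypothetical article text; the population value is illustrative, not census data.
page = (
    "Galway is a city in [[Located in::Ireland]] with a population of "
    "[[Has population::80000|roughly 80,000]]. See also [[Dublin]]."
)

annotations = extract_annotations(page)
for prop, values in annotations.items():
    print(prop, "->", values)
```

Once extracted in this way, the property-value pairs can be treated as structured data about the article's subject, which is what makes finer-grained search and external reuse possible.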
Such annotations are very useful in enterprise scenarios where wikis are used as collaborative spaces in which structured data can be easily entered and updated by a distributed community of enterprise users. One of the largest semantic wikis is Semantic MediaWiki, based on the popular MediaWiki system.

In the context of domain-specific applications, Kaliszyk and Urban [5] provide an overview of collaborative systems for collecting and sharing mathematical knowledge. One of the problems to be solved by these systems is the visualisation of formal proofs for people, while providing assistance with the translation of informal statements into formal mathematics. Disagreements between mathematicians, e.g., over the lack of a unique formal language and of a common axiomatic system, have complicated the development of such formalisation frameworks, necessitating collaborative work like that implemented in wiki applications such as Wikipedia. These collaborative systems, or formal wikis, allow for formal verification, the implementation of formal libraries to provide a unified terminology and theorems, versioning of the proofs, semantic resolution of ambiguities, collaborative editing, and other semantic tools. The authors illustrate some of these features in their systems. The Mizar Wiki [1] provides a mathematical library that includes theorems that can be reused in other proofs; additionally, Mizar makes available a proof checker to validate the proposed proofs and to facilitate a peer review process. Furthermore, ProofWeb is a web interface for editing pages and assisting users during theorem proving. The authors conclude that there are many challenges to be overcome before collaborative systems in formal mathematics can become a reality. In particular, semantics can play a relevant role in future developments.

A paper by Rutledge et al. [7] introduces a technique to annotate and browse a given ontology based on Fresnel forms (http://is.cs.ou.nl/OWF/index.php5/Fresnel_Forms). Fresnel forms are created via a Protégé plugin that allows users to edit and reuse the ontologies, and also to specify how the semantics should be browsed and displayed (using CSS to create a suitable visualisation for a given data structure). The result is a useful tool that exploits the semantics encoded in an ontology to generate semantic wikis. In particular, the authors apply this technique to bridge Semantic MediaWiki with a browsing interface and semantic forms. Evaluation of this approach was carried out in a case study where the authors discussed the feasibility of implementing their tool via its application to the well-known FOAF ontology, and its usefulness in a Wikipedia infobox-style interface.

Zander et al. [9] tackle the problem of pedagogical content management and evaluate how semantic collaborative infrastructures like semantic wikis can have a positive impact on content reusability and authoring. The novelty of this approach lies in an extension to semantic wikis with knowledge encoded in ontologies like the Pedagogical Ontology (PO) and the Semantic Learning Object Model (SLOM, http://www.intuitel.de/) to enhance expressiveness and interoperability across multiple curricula and pedagogies. Furthermore, the combination of these technologies facilitates the generation of pedagogical knowledge as Linked Data. The effectiveness of the proposed framework is empirically evaluated with a user study and measured using a usability test. Reported results suggest that non-computer specialists are highly satisfied with semantic wikis enhanced with pedagogical knowledge, which facilitate the task of creating semantically enriched content for teaching and learning.

4 Exploiting Semantics in Collaborative Spaces

Semantics encoded in ontologies and Linked Data sets can have a positive impact on the behaviour of applications built on top of collaborative spaces. The main challenge to be addressed is how to efficiently extract the knowledge represented in existing Linked Data sets and effectively use this knowledge to enhance existing collaborative spaces. Improvements can be of diverse types, e.g., inferred facts from DBpedia can be used for quality assessment of collaborative space contents; thus, diverse methods to uncover data quality problems should be defined. Additionally, queries against these Linked Data sets have to be crafted in such a way that query answers provide the insights needed to discover missing or faulty content in the collaborative space. Finally, evaluation methodologies are required to determine the quality of the discoveries and to precisely propose changes that will ensure high-quality content in collaborative spaces. This special issue compiles two exemplar applications where the benefits of using semantics are clearly demonstrated.

First, Torres et al. [8] present BlueFinder, a recommendation system able to enhance Wikipedia content with information retrieved from DBpedia (http://wiki.dbpedia.org/). BlueFinder identifies the classes to which a pair of DBpedia resources belong, and uses this information to feed an unsupervised learning algorithm that recommends new associations between disconnected Wikipedia articles. The behaviour of BlueFinder is empirically studied in terms of the number of missing Wikipedia connections that BlueFinder can detect; reported results suggest that by exploiting the semantics encoded in DBpedia, BlueFinder is able to identify 270,367 new Wikipedia connections. BlueFinder addresses the challenges of uncovering missing content in Wikipedia and designing DBpedia queries to recover missing values. As a future direction, the authors propose extending BlueFinder to allow collaborative validation of these values via crowdsourcing and to include the discovered links in Wikipedia.

Following this line of research, Louati et al.
[6] tackle the problem of aggregating heterogeneous social networks, and propose a hybrid graph summarisation approach. The proposed technique extends existing clustering methods such as K-medoids and hierarchical clustering to cluster heterogeneous social networks. The novelty of this approach lies in the usage of attributes and relationships represented in the network to produce clusters that better fulfil user requirements. The proposed summarisation techniques implement an unsupervised learning algorithm that uses Rough Set Theory to enhance the precision of known clustering methods: K-medoids and hierarchical clustering. The proposed approach is empirically evaluated on an existing social network data set on US political books (http://www-personal.umich.edu/~mejn/netdata/). The results suggest that exploiting the semantics encoded in the attributes and relationships positively impacts the quality of the communities identified by the proposed techniques. The challenges addressed by the authors provide the basis for the development of tools for uncovering patterns in the data that suggest quality problems in the content of the collaborative space. In the future, the authors plan to extend the proposed graph summarisation techniques to dynamically adapt the attributes and relationships used to cluster the input network. This extension will allow for the representation of more general domain restrictions, and in consequence, a more expressive semantically-driven clustering approach.

While Semantic Web technologies have allowed many data providers to publish datasets with their own ontologies, consuming these datasets requires aligning different ontologies and performing entity matching. The Semantic Web community has developed sophisticated tools to tackle this problem, but a generic, automatic, and reliable solution remains an open research problem. Human-machine collaboration can be effective in promoting the reuse of contextual trusted ontology mappings. This special issue also includes work by Bortoli et al. [2] on Okkam, a collaborative tool designed to share ontology mappings. Okkam allows users to disagree on mappings, but also to rank mappings according to their contextual sharedness. Consequently, Okkam allows users to quickly find mappings adapted to their usage, knowing their level of agreement for a given context.

5 Future Directions

A large research effort has enabled the continuous transformation of content to knowledge, building the Semantic Web from a Web of Documents and the Deep Web. A new major research challenge is to ensure a co-evolution of content and knowledge, making both of them trustable. This means that we must be able to not only extract and manage knowledge from contents, but also to augment contents based on knowledge. Co-evolution of content and knowledge happens in a semantic wiki, but the question is how we can scale this to the level of the Web of Documents and the Web of Data.

Co-evolution of content and knowledge is a powerful mechanism to improve both content and knowledge, but such an evolution has to be powered by a new form of collaboration between humans and AI. From the point of view of people, human-AI collaboration means that we must make formal knowledge and its evolution accessible, usable, editable and understandable, so that people can observe, control, evaluate and reuse this formal knowledge. All research works related to explanations and knowledge revision follow this direction. In the other direction, from the point of view of computers, human-AI collaboration means that we must be able to take into account the unpredictable behaviours of people who can at any moment in time add or modify content and formal knowledge, with the risk of introducing uncertainty or inconsistency. Research works related to uncertainty management in crowdsourcing and truth-finding algorithms are of interest here.
Bootstrapping the co-evolution of content and knowledge requires that we find new ways for humans and AI to collaborate. Knowledge benefits, i.e., querying, reasoning, fact checking, and truth finding, have to be easily available to anyone using the Web. On the other hand, knowledge quality raises really challenging questions when deployed, for example, if the mistakes found by people contradict (and should override) what is formally stated in accepted knowledge bases. Finally, the co-evolution of content and knowledge has to be monitored, ensuring that the overall quality of content and knowledge improves.

Acknowledgements. Pascal Molli has been supported by the ANR Kolflow Project (ANR-10-CORD-0021), University of Nantes. John G. Breslin has been supported by Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289 (Insight). Maria-Esther Vidal thanks the DID-USB (Decanato de Investigación y Desarrollo de la Universidad Simón Bolívar).

References

1. Alama, J., Brink, K., Mamane, L., Urban, J.: Large formal wikis: issues and solutions. In: Davenport, J.H., Farmer, W.M., Urban, J., Rabe, F. (eds.) MKM 2011 and Calculemus 2011. LNCS, vol. 6824, pp. 133–148. Springer, Heidelberg (2011)
2. Bortoli, S., Bouquet, P., Bazzanella, B.: Okkam synapsis: connecting vocabularies across systems and users. In: Advances in Semantic Web Collaborative Spaces, Revised Selected Papers of the Semantic Web Collaborative Spaces (SWCS) 2013 and 2014 (2016)
3. Goy, A., Magro, D., Petrone, G., Picardi, C., Segnan, M.: Shared and personal views on collaborative semantic tables. In: Advances in Semantic Web Collaborative Spaces, Revised Selected Papers of the Semantic Web Collaborative Spaces (SWCS) 2013 and 2014 (2016)
4. Goy, A., Magro, D., Petrone, G., Segnan, M.: Collaborative semantic tables. In: Proceedings of the Third International Workshop on Semantic Web Collaborative Spaces Co-located with the 13th International Semantic Web Conference (ISWC), Riva del Garda, Italy, 19 October 2014
5. Kaliszyk, C., Urban, J.: Wikis and collaborative systems for large formal mathematics. In: Advances in Semantic Web Collaborative Spaces, Revised Selected Papers of the Semantic Web Collaborative Spaces (SWCS) 2013 and 2014 (2016)
6. Louati, A., Aufaure, M.-A., Cuvelier, E., Pimentel, B.: Soft and adaptive aggregation of heterogeneous graphs with heterogeneous attributes. In: Advances in Semantic Web Collaborative Spaces, Revised Selected Papers of the Semantic Web Collaborative Spaces (SWCS) 2013 and 2014 (2016)
7. Rutledge, L., Brenninkmeijer, T., Zwanenberg, T., van de Heijning, J., Mekkering, A., Theunissen, J.N., Bos, R.: From ontology to semantic wiki – designing annotation and browse interfaces for given ontologies. In: Advances in Semantic Web Collaborative Spaces, Revised Selected Papers of the Semantic Web Collaborative Spaces (SWCS) 2013 and 2014 (2016)
8. Torres, D., Skaf-Molli, H., Molli, P., Diaz, A.: Discovering Wikipedia conventions using DBpedia properties. In: Advances in Semantic Web Collaborative Spaces, Revised Selected Papers of the Semantic Web Collaborative Spaces (SWCS) 2013 and 2014 (2016)
9. Zander, S., Swertz, C., Verdu, E., Perez, M.J.V., Henning, P.: A semantic MediaWiki-based approach for the collaborative development of pedagogically meaningful learning content annotations. In: Advances in Semantic Web Collaborative Spaces, Revised Selected Papers of the Semantic Web Collaborative Spaces (SWCS) 2013 and 2014 (2016)
