Abstract. This joint tutorial by the consortium of the European IP project LOD2 (Creating Knowledge out of Interlinked Data) gives an overview of creating, managing and using Linked Data sources.
The performance of triple stores is one of the major obstacles to the deployment of semantic technologies in many usage scenarios. In particular, Semantic Web applications, which use triple stores as persistence backends, trade performance for the advantage of flexibility with regard to information structuring.
Abstract. The WebID protocol enables the global identification and authentication of agents in a distributed manner by combining asymmetric cryptography and Linked Data. To decide whether access should be granted or denied to a particular WebID, the authenticating web server may need to retrieve other profiles and linked resources to work out whether the requesting agent is a member of an authorized group (e.g., friends of the resource owner's friends).
The World Wide Web currently enjoys unprecedented popularity. Never before have there been so many websites and so many users publishing or searching for information there. As the World Wide Web grows, it naturally becomes harder for users to find exactly the data they are looking for. For this reason, the idea of the Semantic Web (cf. Chapter 2.1) is increasingly coming to the fore.
Abstract. With the growth of the Linked Data Web, time-efficient approaches for computing links between data sources have become indispensable. Yet, in many cases, determining the right specification for a link discovery problem is a tedious task that must still be carried out manually. In this article we present RAVEN, an approach for the semi-automatic determination of link specifications.
Abstract. Since its inception in the early 2000s, Wiki technology has become a ubiquitous pillar for enabling large-scale collaboration. However, the Wiki paradigm was mainly applied to unstructured, textual content, thus limiting content structuring, repurposing and reuse. More recently, with the appearance of Semantic Wikis, the Wiki concept was also applied and extended towards semantic content, with adverse effects on scalability.
The pan-European project LOD2 (Linking Open Data) is one of the biggest projects dealing with linked data. Scientists, programmers and software architects in various European countries are working on the next generation of linked open data. In a series of interviews, Thomas Thruner of the Semantic Web Company (SWC) presents people working on and with LOD2. To start, SWC talked to Sören Auer, head of the LOD2 project.
Abstract. We present a declarative approach implemented in a comprehensive open-source framework based on DBpedia to extract lexical-semantic resources (an ontology about language use) from Wiktionary. The data currently includes language, part of speech, senses, definitions, synonyms, translations and taxonomies (hyponyms, hyperonyms, synonyms, antonyms) for each lexical word. The main focus is on flexibility with respect to the loose schema and configurability for the differing language editions of Wiktionary.
Recently, practical approaches for managing and supporting the life-cycle of semantic content on the Web of Data have made considerable progress. However, the least developed aspect of the semantic content life-cycle is currently the user-friendly manual and semi-automatic creation of rich semantic content.
Abstract: Current research funding practices do not yet make effective use of the communication and collaboration tools of the Internet age. Combined with a number of information flow barriers associated with research funding, this results in inefficiencies and a lack of transparency. We present a vision of how an open science platform for research funding and cross-fertilization could be realized. It is based on stakeholder involvement and community self-organisation.
Abstract. Many companies strive to increase their value proposition to the traditional Web search engine and to novel applications. Given the increased popularity of the Semantic Web and Linked Open Data, this paper presents a method to create rich semantic annotations using the RDFaCE approach. The approach is based on providing different views to content authors, such as a classical WYSIWYG view and a WYSIWYM (What You See Is What You Mean) view that makes the semantic annotations visible.
Abstract. Linked Open Data (LOD) comprises an unprecedented volume of structured data available on the Web. However, these datasets vary widely in quality, ranging from extensively curated datasets to crowd-sourced and even extracted data of relatively low quality. We present a methodology for crowd-sourcing the quality assessment of linked data resources. The first step of the methodology comprises the detection of common quality problems and their representation in a quality problem taxonomy.
Abstract. Accessing the wealth of structured data available on the Data Web is still a key challenge for lay users. Keyword search is the most convenient way for users to access information (e.g., from data repositories). In this paper we introduce a novel approach for determining the correct resources for user-supplied keyword queries based on a hidden Markov model. In our approach the user-supplied query is modeled as the observed data and the background knowledge is used for parameter estimation.
Semantic mashups are applications that use linked data from multiple Web data sources by means of standardized data formats and access mechanisms. The article gives an overview of the idea and motivation behind linking data. Using concrete examples, it discusses various architectures and approaches for generating RDF data from existing Web 2.0 data sources, for interlinking the extracted data, and for publishing the data on the Web.
Abstract: The recent success of the Semantic Web in research, technology and standardisation communities has also resulted in a large variety of different standards, technologies and tools. This diversity and heterogeneity is accompanied by increasing complexity in assessing, evaluating, selecting and combining different approaches for the development of Semantic Web Applications (SWA).
In this paper we tackle some pressing obstacles of the emerging Linked Data Web, namely the quality, timeliness and coherence of data, which are prerequisites for providing direct end-user benefits. We present an approach for complementing the Linked Data Web with a social dimension by extending the well-known Pingback mechanism, which is a technological cornerstone of the blogosphere, towards a Semantic Pingback.
The NLP Interchange Format (NIF) is an RDF/OWL-based format that provides interoperability between Natural Language Processing (NLP) tools, language resources and annotations by allowing NLP tools to exchange annotations about text documents in RDF. Unlike more centralized solutions such as UIMA and GATE, NIF enables the creation of heterogeneous, distributed and loosely coupled NLP applications, which use the Web as an integration platform.
Purpose: DBpedia extracts structured information from Wikipedia, interlinks it with other knowledge bases and freely publishes the results on the web using Linked Data and SPARQL. However, the DBpedia release process is heavyweight and releases are sometimes based on data that is several months old. DBpedia-Live solves this problem by providing a live synchronization method based on the update stream of Wikipedia. This paper seeks to address these issues.
Abstract. Evidence-based policy is policy informed by rigorously established objective evidence. An important aspect of evidence-based policy is the use of scientifically rigorous studies to identify programs and practices capable of improving policy-relevant outcomes. Statistics represent a crucial means to determine whether progress is made towards policy targets.
Abstract. Despite decades of effort, intelligent object search remains elusive. Neither search engine nor semantic web technologies alone have managed to provide usable systems for simple questions such as “find me a flat with a garden and more than two bedrooms near a supermarket.” We introduce deqa, a conceptual framework that achieves this elusive goal by combining state-of-the-art semantic technologies with effective data extraction.
Papers by Sören Auer