The document discusses the prevalence of linked open data on the web. It notes that many large datasets and knowledge bases, such as DBpedia, Google Knowledge Graph, and others have been released as linked data. It also provides a long list of over 200 specific linked open data sources that have been published as of August 2014 across various domains.
Query-Driven Management of Linked Data Quality
1. Query-Driven Management of Linked Data Quality
Fariz Darari
Supervised by: Prof. Werner Nutt
Free University of Bozen-Bolzano
fadirra@gmail.com
Dec 5, 2014
Fariz Darari (UniBZ) Dec 5, 2014 1 / 19
2. Overview
1 Quality Linked Data is Everywhere?
2 Query-Driven Data Quality
3 Research Problems
4 Current Results
3. Section 1
Quality Linked Data is Everywhere?
4. DBpedia
Everybody knows Wikipedia.
DBpedia is a Linked Data version of Wikipedia.
6. Linked Data is Everywhere
[Figure: Linked Open Data cloud diagram, "Linked Datasets as of August 2014". Legend categories: Publications, Life Sciences, Cross-Domain, Social Networking, Geographic, Government, Media, User-Generated Content. Dataset nodes shown include DBpedia, Freebase, YAGO, Bio2RDF, RKB Explorer, StatusNet, and GovUK datasets.]
11. Sieve's Approach
Pablo N. Mendes, Hannes Mühleisen, Christian Bizer: Sieve: Linked Data quality assessment and fusion.
EDBT/ICDT Workshops 2012.
13. Generalization: Query-Driven Data Quality
2-step query-driven Linked Data quality management:
We annotate parts of the data with data quality aspects such as completeness, correctness, and timeliness.
Then, query answers are checked to see whether they reside inside the annotated parts.
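The two steps can be illustrated with a minimal sketch. It is not the framework from the talk: it assumes, purely for illustration, that an annotated part of the data is described by a single triple pattern with wildcards, and that a query answer is a set of plain triples. The names Triple, Annotation, covers, and aspects_for_answers, as well as the sample data, are hypothetical.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    s: str
    p: str
    o: str


class Annotation(NamedTuple):
    # Hypothetical convention: None is a wildcard matching any value.
    s: Optional[str]
    p: Optional[str]
    o: Optional[str]
    aspects: frozenset  # e.g. {"completeness", "correctness"}


def covers(ann: Annotation, triple: Triple) -> bool:
    """Return True if the triple lies inside the part described by ann."""
    return all(a is None or a == v for a, v in zip(ann[:3], triple))


def aspects_for_answers(answers, annotations):
    """Step 2: map each answer triple to the quality aspects that cover it."""
    result = {}
    for triple in answers:
        aspects = set()
        for ann in annotations:
            if covers(ann, triple):
                aspects |= ann.aspects
        result[triple] = aspects
    return result


# Step 1: annotate a part of the data (all "directedBy" triples).
annotations = [
    Annotation(None, "directedBy", None,
               frozenset({"completeness", "correctness"})),
]

# Step 2: check which query answers fall inside an annotated part.
answers = [
    Triple("film1", "directedBy", "personA"),
    Triple("film1", "starring", "personB"),
]
quality = aspects_for_answers(answers, annotations)
```

In this sketch the first answer inherits the completeness and correctness annotations, while the second falls outside every annotated part and therefore carries no quality guarantee.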
15. Interrelations between Data Quality Aspects
There is already a query-driven data completeness framework [1].
How can we generalize the model to support the representation of data correctness, data completeness, data timeliness, and other quality aspects over data sources,
and provide reasoning techniques for the interrelations between these aspects?
[1] Fariz Darari, Werner Nutt, Giuseppe Pirrò, Simon Razniewski: Completeness Statements about RDF Data Sources and Their Use for Query Answering. ISWC 2013.
16. Other Research Problems
Data Quality and Data Provenance
Data Completeness and Linked Data Streams
Data Completeness and SPARQL Queries
Data Quality and Data Privacy
18. Current Results
Timestamped completeness statements
Efficient techniques for completeness reasoning [2]
Completeness reasoning implementation [3]
RDF and SPARQL reconciliation [4]
[2] To be submitted to ACM TWEB.
[3] Fariz Darari, Radityo Eko Prasojo, Werner Nutt: CORNER: A Completeness Reasoner for SPARQL Queries Over RDF Data Sources. ESWC 2014 Demo.
[4] Fariz Darari, Simon Razniewski, Werner Nutt: Bridging the Semantic Gap between RDF and SPARQL using Completeness Statements. ISWC 2014 Poster.
19. Query-Driven Management of Linked Data Quality
Take away messages:
The query-driven data quality approach is a formal but flexible way to manage Linked Data quality.
Thank you!