The possibilities and challenges of using linked
data for academic research
The case of the Talk of Europe project
Laura Hollink, Centrum Wiskunde & Informatica, Amsterdam
Martijn Kleppe, Erasmus University Rotterdam
Max Kemman, University of Luxembourg
Astrid van Aggelen, VU University Amsterdam
Willem van Hage, SynerScope, Helvoirt
European Parliament as Linked Data
• Goal: publish the plenary debates
of the European Parliament as
Linked Open Data
• Why is this important?
A. Large-scale analysis across
time spans
B. Access to the proceedings of
the European Parliament is a
formal right of residents of
the European Union.
• Linked Data: a format for publishing
data on the Web, with URIs as
permanent identifiers, designed for
connecting pieces of data.
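
As a minimal sketch of this idea, Linked Data statements can be modelled as subject-predicate-object triples in which shared URIs connect the pieces. Every URI and predicate name below is a made-up `http://example.org/` placeholder, not the project's actual vocabulary.

```python
# Toy triples; every URI below is a hypothetical placeholder.
EX = "http://example.org/"

triples = [
    (EX + "speech/1", EX + "spokenBy", EX + "mep/1"),
    (EX + "speech/1", EX + "text", "So let us go for open data"),
    (EX + "mep/1", EX + "countryOfRepresentation", EX + "country/NL"),
]


def objects(triples, subject, predicate):
    """Return all objects of statements matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def speaker_country(triples, speech):
    """Follow two links: speech -> speaker -> country."""
    countries = []
    for speaker in objects(triples, speech, EX + "spokenBy"):
        countries.extend(
            objects(triples, speaker, EX + "countryOfRepresentation")
        )
    return countries


print(speaker_country(triples, EX + "speech/1"))
```

Because the speaker is identified by a URI rather than a name string, any other dataset that uses the same URI can add statements about that person without further matching.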
Data
14M statements about the 30K
speeches by 3K speakers in 1K
session days that were held in the EU
parliament between 1999 and 2014
Links
Country names
Members of Parliament
Members of Parliament + Parties
Online database with background
information about MEPs: “committee,
party group and delegation membership,
as well as leadership positions”
[An Automated Database of the European Parliament.
Bjørn Høyland, Indraneel Sircar, and Simon Hix, European
Union Politics 10(1):143-152, 2009.]
Example 1: speeches that contain a certain keyword
Query: all speeches that contain the phrase “open data”
…. So let us go for open data, let us
go for utilisation of all the instruments
available to that end! …..
…. but there too governments are
encouraging the use of open data to
increase transparency, accountability
and citizen participation ….
…. We already have many open data
projects in the Member States and
local authorities…..
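
A keyword query of this kind might look roughly like the sketch below, which builds a SPARQL 1.1 query string in Python. The `ex:` prefix and the `ex:text` predicate are hypothetical placeholders; the real property names have to be taken from the project's documentation.

```python
def keyword_query(phrase, limit=100):
    """Build a SPARQL query for speeches whose text contains `phrase`.

    The ex: prefix and ex:text predicate are hypothetical placeholders,
    not the Talk of Europe vocabulary.
    """
    # Lowercase for a case-insensitive match, then escape backslashes and
    # double quotes so the phrase is a valid SPARQL string literal.
    escaped = phrase.lower().replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX ex: <http://example.org/>\n"
        "SELECT ?speech ?text WHERE {\n"
        "  ?speech ex:text ?text .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?text)), "{escaped}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


print(keyword_query("open data"))
```

A plain substring filter is the simplest option; an endpoint with a full-text index would usually offer a faster, store-specific alternative.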
Example 2: speeches that contain a certain keyword
by date
[Figure: Mentions of 'human rights'; x-axis: dates (1999 to 2013), y-axis: frequency (0 to 800)]
Example 2: speeches that contain a certain keyword
by country
[Figure: Mentions of 'human rights' by country; countries: AT BE BG CY CZ DE DK EE ES FI FR GB GR HR HU IE IT LT LU LV MT NL PL PT RO SE SI SK; frequency axis: 0 to 7000]
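
Both breakdowns in Example 2 come down to the same grouping step, sketched here in Python. The record layout (`year`, `country`, `text`) and the sample records are invented for illustration, and the function counts speeches that contain the phrase, which may differ from how the mentions in the charts were counted.

```python
from collections import Counter


def count_mentions(speeches, phrase, key):
    """Count speeches containing `phrase`, grouped by the field `key`."""
    needle = phrase.lower()
    counts = Counter()
    for speech in speeches:
        if needle in speech["text"].lower():
            counts[speech[key]] += 1
    return dict(sorted(counts.items()))


# Invented placeholder records, not taken from the dataset.
speeches = [
    {"year": 1999, "country": "NL", "text": "Human rights matter."},
    {"year": 1999, "country": "DE", "text": "On the budget."},
    {"year": 2000, "country": "NL", "text": "We defend human rights."},
]

print(count_mentions(speeches, "human rights", "year"))
print(count_mentions(speeches, "human rights", "country"))
```

Raw counts favour countries and years with more speeches, so a comparison across groups would normally divide by the total number of speeches per group.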
Example 3: background info about the MEPs
• MEPs that were not born in Europe.
Members of Parliament
Integrate data from
the EU parliament
with external datasets
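
A question like the one in Example 3 needs a join between the parliament data and an external source. The sketch below shows the shape of such a join in Python; the `ex:` and `ext:` identifiers, the link table, and the birth data are all invented placeholders.

```python
# All identifiers and values are invented placeholders.

# Parliament dataset: MEP URI -> URI of the same person in an external dataset.
same_as = {
    "ex:mep1": "ext:person1",
    "ex:mep2": "ext:person2",
    "ex:mep3": "ext:person3",
}

# External dataset: person URI -> continent of birth (missing for some people).
birth_continent = {
    "ext:person1": "Europe",
    "ext:person2": "Asia",
}


def born_outside_europe(same_as, birth_continent):
    """Return MEPs whose linked external record gives a non-European birthplace."""
    result = []
    for mep, person in sorted(same_as.items()):
        continent = birth_continent.get(person)
        if continent is not None and continent != "Europe":
            result.append(mep)
    return result


print(born_outside_europe(same_as, birth_continent))
```

Note that `ex:mep3` is silently left out because the external dataset has no birthplace for that person, so the answer is only as complete as the links and the external data.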
What other knowledge do we have available?
GDP
Region
Population density
Neighbouring countries
Age
Religion
Education
Spouse / children
Previous occupations
Place of birth/residence
Speeches in the
Italian parliament
Membership of committees
Leadership positions
DEMO tomorrow
14:20-16:00
Discussion: What
happens if we use
linked data as source
data for research?
Implications for use?
Credibility
• Who created it? How?
• The quality may vary:
• EP vs. Wikipedia
Completeness
• How complete is it? Is there a
way to tell how complete it is?
• Completeness may vary:
• EP vs. Wikipedia
Update frequency
• When was the data last
updated?
• Update frequency may vary:
• EP vs. “An automated
database of the EP”
Credibility, completeness,
update frequency of the
links
• Who made them? How? When?
How complete are they?
Message:
the need for dataset
evaluation is exacerbated
when using linked data
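
One partial way to make the completeness question concrete is to measure how many known entities actually have a value for a given property. The sketch below assumes a simple mapping from entity to value; the function name and data layout are illustrative, not part of the project.

```python
def coverage(entities, values):
    """Fraction of `entities` that have a non-missing value in `values`."""
    if not entities:
        return 0.0
    known = sum(1 for entity in entities if values.get(entity) is not None)
    return known / len(entities)


# Invented placeholder data: four entities, two of them with a known value.
print(coverage(["a", "b", "c", "d"], {"a": 1, "b": None, "c": 2}))
```

This only measures coverage relative to the entities already in the data; it cannot reveal entities or links that are missing altogether.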
How to use this data, in practice
The bad news: we don’t have a friendly user interface :’(

The good news: our data + all
sources we link to are openly
available for everyone :)

Options for use:
1. Tell us what you want to know
and we will write you a query. 

2. Go to our website, copy-paste an
example query into the query
editor.

3. Go to our website, write a
SPARQL query in the query editor.

4. Query our SPARQL endpoint
programmatically.

Website: via http://talkofeurope.eu/data/
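
Option 4 could be sketched as follows, using only the Python standard library and the standard SPARQL Protocol (a GET request with a `query` parameter, asking for JSON results). The endpoint URL is a hypothetical placeholder; the real address should be taken from the project website.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint URL; the real address is listed on the project website.
ENDPOINT = "http://example.org/sparql"


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(payload):
    """Turn a SPARQL JSON results document into a list of plain dicts."""
    rows = json.loads(payload)["results"]["bindings"]
    return [{name: cell["value"] for name, cell in row.items()} for row in rows]


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the parsed rows (requires network access)."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

Calling `run_query("SELECT * WHERE { ?s ?p ?o } LIMIT 5", endpoint=...)` with the real endpoint address would return up to five rows as dictionaries keyed by variable name.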
Use of the data during three Creative Camps
• 3 events of one week each,
where people are invited to
work with our data on-site.

• Outcome CC #1 in Hilversum:

• Links to the Italian
parliament.

• Detection of people who
speak about an unusual
mix of topics.

• Sentiment analysis
Talk of Europe team
Martijn Kleppe
Henri Beunders
Max Kemman
Jill Briggeman
Astrid van Aggelen
Laura Hollink
Marnix van Berchum


Talk of Europe @ DHBenelux2015
