Presentation at Digital Humanities Benelux 2015, Antwerp, Belgium: The possibilities and challenges of using linked data for academic research: the case of the Talk of Europe project. Laura Hollink, Martijn Kleppe, Max Kemman, Astrid van Aggelen, Willem Robert Van Hage.
Talk of Europe @ DHBenelux2015
1. The possibilities and challenges of using linked data for academic research
The case of the Talk of Europe project
Laura Hollink Centrum Wiskunde & Informatica, Amsterdam
Martijn Kleppe Erasmus University Rotterdam
Max Kemman University of Luxembourg
Astrid van Aggelen VU University Amsterdam
Willem van Hage SynerScope, Helvoirt
2. European Parliament as Linked Data
• Goal: publish the plenary debates
of the European Parliament as
Linked Open Data
• Why is this important?
A. Large scale analysis across
time spans
B. Access to the proceedings of
the European Parliament is a
formal right of residents of the
European Union.
• Linked Data: a format for publishing
data on the Web, with URIs as
permanent identifiers, designed for
connecting pieces of data.
3. Data
14M statements about the 30K
speeches by 3K speakers in 1K
session days that were held in the EU
parliament between 1999 and 2014
4. Links
Country names
Members of Parliament
Members of Parliament
Members of Parliament + Parties
Online database with background
information about MEPs: “committee,
party group and delegation membership,
as well as leadership positions”
[An Automated Database of the European Parliament.
Bjørn Høyland, Indraneel Sircar, and Simon Hix, European
Union Politics 10(1):143-152, 2009.]
5. Example 1: speeches that contain a certain keyword
Query: all speeches that contain the phrase “open data”
…. So let us go for open data, let us
go for utilisation of all the instruments
available to that end! …..
…. but there too governments are
encouraging the use of open data to
increase transparency, accountability
and citizen participation ….
…. We already have many open data
projects in the Member States and
local authorities…..
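A keyword query like the one on this slide could be assembled roughly as follows. This is a minimal sketch, not the project's actual query: the `ex:` prefix and the `ex:Speech` and `ex:text` names are hypothetical placeholders for the real vocabulary, and `build_keyword_query` is a helper invented for illustration. `CONTAINS`, `LCASE` and `STR` are standard SPARQL 1.1 functions.

```python
def build_keyword_query(phrase: str, limit: int = 10) -> str:
    """Return a SPARQL query for speeches whose text contains `phrase`."""
    # Escape backslashes and double quotes so the phrase stays a valid
    # SPARQL string literal.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    # ex:Speech and ex:text are hypothetical names, not the real vocabulary.
    return f"""
PREFIX ex: <http://example.org/vocab/>
SELECT ?speech ?text WHERE {{
  ?speech a ex:Speech ;
          ex:text ?text .
  FILTER(CONTAINS(LCASE(STR(?text)), LCASE("{escaped}")))
}}
LIMIT {int(limit)}
""".strip()


query = build_keyword_query("open data")
print(query)
```

The filter lowercases both sides, so "Open Data" and "open data" both match. Swapping in the real class and property names is the only change needed before pasting the result into a query editor.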
6. Example 2: speeches that contain a certain keyword by date
[Figure: mentions of 'human rights' per year, 1999 to 2013. x-axis: dates; y-axis: frequency, 0 to 800.]
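Once query results are available as (date, text) pairs, a per-year count like the one plotted here can be computed client-side. The sketch below assumes ISO-formatted dates and counts speeches that mention the phrase at least once, which may differ from how the original chart was produced. The sample rows are invented for illustration and are not taken from the dataset.

```python
from collections import Counter


def mentions_per_year(speeches, phrase):
    """Count speeches per year whose text contains `phrase` (case-insensitive)."""
    needle = phrase.lower()
    counts = Counter()
    for date, text in speeches:
        if needle in text.lower():
            counts[date[:4]] += 1  # ISO dates start with the four-digit year
    return dict(sorted(counts.items()))


# Illustrative rows only, not real speeches.
sample = [
    ("1999-07-20", "We must defend human rights everywhere."),
    ("1999-09-14", "The budget debate continues."),
    ("2000-01-18", "Human rights are not negotiable."),
    ("2000-03-01", "A report on human rights in the region."),
]
print(mentions_per_year(sample, "human rights"))  # prints {'1999': 1, '2000': 2}
```

Grouping by a different key, for example the speaker's country code instead of `date[:4]`, gives the per-country variant of the same count.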
7. Example 2: speeches that contain a certain keyword by country
[Figure: mentions of 'human rights' by country. x-axis: AT BE BG CY CZ DE DK EE ES FI FR GB GR HR HU IE IT LT LU LV MT NL PL PT RO SE SI SK; y-axis: frequency, 0 to 7000.]
8. Example 3: background info about the MEPs
• MEPs that were not born in Europe.
Members of Parliament
Integrate data from
the EU parliament
with external datasets
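The integration step can be pictured as a join across a link: follow each MEP's link to an external record, then read a property that only the external dataset has. The sketch below does this with plain dictionaries. All URIs, names and country codes are made up for illustration, and `born_outside` is a hypothetical helper, not part of the project.

```python
def born_outside(meps, same_as, birth_country, european_countries):
    """Return names of MEPs whose linked external record gives a birth
    country outside `european_countries`."""
    result = []
    for mep_uri, name in meps.items():
        external_uri = same_as.get(mep_uri)          # follow the link, if any
        country = birth_country.get(external_uri)    # property from the external dataset
        if country is not None and country not in european_countries:
            result.append(name)
    return sorted(result)


# Illustrative data only.
meps = {"ep:1": "MEP A", "ep:2": "MEP B", "ep:3": "MEP C"}
same_as = {"ep:1": "ext:a", "ep:2": "ext:b"}          # ep:3 has no link
birth_country = {"ext:a": "NL", "ext:b": "CA"}
print(born_outside(meps, same_as, birth_country, {"NL", "BE", "DE"}))  # prints ['MEP B']
```

Note that "MEP C" is silently left out because no link exists, so the answer is only as complete as the links and the external data.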
9. What other knowledge do we have available?
GDP
Region
Population density
Neighbouring countries
Age
Religion
Education
Spouse / children
Previous occupations
Place of birth/residence
Speeches in the
Italian parliament
Membership of committees
Leadership positions
DEMO tomorrow
14:20-16:00
11. Implications for use?
Credibility
• Who created it? How?
• The quality may vary:
• EP vs. Wikipedia
Completeness
• How complete is it? Is there a
way to tell how complete it is?
• Completeness may vary:
• EP vs. Wikipedia
Update frequency
• When was the data last
updated?
• Update frequency may vary:
• EP vs. “An automated
database of the EP”
Credibility, completeness,
update frequency of the
links
• Who made them? How? When?
How complete are they?
Message:
the need for dataset
evaluation is exacerbated
when using linked data
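One simple, partial way to get a feel for completeness is to measure how many records carry a value for a given property. The sketch below assumes records have already been fetched into dictionaries. The property name and sample rows are invented, and a high coverage figure says nothing about whether the values are correct.

```python
def property_coverage(records, prop):
    """Fraction of records that have a non-empty value for `prop`."""
    if not records:
        return 0.0
    filled = sum(1 for record in records if record.get(prop) not in (None, ""))
    return filled / len(records)


# Illustrative records only.
people = [
    {"name": "MEP A", "birthplace": "Antwerp"},
    {"name": "MEP B", "birthplace": ""},
    {"name": "MEP C"},
    {"name": "MEP D", "birthplace": "Rotterdam"},
]
print(property_coverage(people, "birthplace"))  # prints 0.5
```

Running the same check on the links themselves (what fraction of MEPs have a link to each external source) gives a comparable figure for link completeness.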
12. How to use this data, in practice
The bad news: we don’t have a friendly user interface :’(
The good news: our data + all
sources we link to are openly
available for everyone :)
Options for use:
1. Tell us what you want to know
and we will write you a query.
2. Go to our website, copy-paste an
example query into the query
editor.
3. Go to our website, write a
SPARQL query in the query editor
4. Query our SPARQL endpoint
programmatically.
Website: http://talkofeurope.eu/data/
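Option 4 could look roughly like this, using only the Python standard library and the standard SPARQL 1.1 Protocol (a GET request with a `query` parameter, results requested as SPARQL JSON). The endpoint URL below is a placeholder and not the project's real endpoint, which is documented on the website. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://example.org/sparql"  # placeholder, replace with the real endpoint


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_bindings(payload):
    """Flatten a SPARQL JSON result document into a list of {variable: value} dicts."""
    data = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in data["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT):
    """Send the query over the network and return the parsed rows."""
    with urlopen(build_request(query, endpoint), timeout=30) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

With a real endpoint in place, `run_query("SELECT * WHERE { ?s ?p ?o } LIMIT 5")` would return up to five rows as dictionaries keyed by variable name.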
13. Use of the data during three Creative Camps
• 3 events of one week each,
where people are invited to
work with our data on-site.
• Outcome CC #1 in Hilversum:
• Links to the Italian
parliament.
• Detection of people who
speak about an unusual
mix of topics.
• Sentiment analysis
14. Talk of Europe team
Martijn Kleppe Henri Beunders
Max Kemman Jill Briggeman
Astrid van Aggelen
Laura Hollink
Marnix van Berchum