A semantic portal was developed to allow different user groups to access the data easily. The user groups include researchers, students, and the wider public interested in either the Finnish Civil War in general or the fates of their relatives. The portal is also used as the main exploration tool of the data by the maintainers of the data at the National Archives of Finland.
4.1 Using the Portal
The user interface is built around the concept of faceted search [4, 25]. With faceted search, the user narrows the search step by step by making selections from predetermined orthogonal hierarchies of property values called facets. Each facet also shows the number of matching items for every possible selection, so the user can immediately see how many results each selection would yield. By combining selections on several facets, such as occupation, party, and age, the user may also draw interesting conclusions from the hit distributions shown on the facets. Faceted search can therefore be used not only to find individuals that fit certain criteria, such as relatives, but also to examine the distributions of different kinds of casualties. The faceted search paradigm is an example of exploratory search.
The user interface currently includes two main perspectives for exploring the underlying knowledge graph: (1) The main perspective is based on searching and exploring the casualties. (2) There is also a perspective based on the battles of the Finnish Civil War, currently covering 1,182 geo-coded battles. Other views may be added later in the same way as in other “Sampo” series semantic portals.³⁰
On the front page of the portal, the user is presented with links to the different perspectives. The perspectives can also be reached using the top bar of the application.
The user can navigate to the War Victims perspective by clicking the picture representing this perspective. There the user is presented with facets for filtering the data on the left and a results view on the right. The user can switch the results view using the tabs above it. By default the user is presented with a table view of the war victims in the selection. In Figure 3, selections have been made in the facets so that the group has been narrowed down to 78 people, who are shown as a list.
The user can make selections in the facets, and the result set is updated after each selection. Facets also show the number of hits available for each selection, which makes relative numbers in the data immediately visible. For example, if no other selections are made, the value Tampere in the birthplace facet is followed by the number 458. This means that the data currently contains 458 people who were born in Tampere. Selections in other facets may change this number as the selected group changes. For example, after selecting the Reds from the party facet, the number changes to 357: the data currently contains 357 war victims who were born in Tampere and who supported the Red side in the Civil War.
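The way hit counts react to selections in other facets can be illustrated with a minimal sketch. The records, the property names `birthplace` and `party`, and the helpers `applySelections` and `facetCounts` are hypothetical and operate on an in-memory array. The sketch also assumes that a facet ignores its own selection when counting, so that alternative values stay visible; how the portal actually computes its counts is not described here.

```javascript
// Hypothetical toy records; the real data lives in the knowledge graph.
const victims = [
  { name: 'A', birthplace: 'Tampere', party: 'Red' },
  { name: 'B', birthplace: 'Tampere', party: 'White' },
  { name: 'C', birthplace: 'Helsinki', party: 'Red' },
  { name: 'D', birthplace: 'Tampere', party: 'Red' }
]

// Keep only the records that match every active facet selection.
function applySelections (records, selections) {
  return records.filter(record =>
    Object.entries(selections).every(([facet, value]) => record[facet] === value)
  )
}

// Count how many records carry each value of one facet, given the selections
// made in the OTHER facets (the facet's own selection is ignored; an assumption).
function facetCounts (records, selections, facet) {
  const { [facet]: ignored, ...otherSelections } = selections
  const counts = {}
  for (const record of applySelections(records, otherSelections)) {
    counts[record[facet]] = (counts[record[facet]] || 0) + 1
  }
  return counts
}

console.log(facetCounts(victims, {}, 'birthplace')) // Tampere: 3, Helsinki: 1
console.log(facetCounts(victims, { party: 'Red' }, 'birthplace')) // Tampere: 2, Helsinki: 1
```

Selecting a value in the party facet lowers the count shown for Tampere in the birthplace facet, which mirrors the change from 458 to 357 described above.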
The facets include text facets, checkbox facets, date facets, and slider facets. The text facet can be used to search by the name of the victim, and this may be the search tool most commonly used by the public. Checkbox facets are used to filter ontologized values such as the party in the Civil War. Date facets can be used to filter by death or birth date, and slider facets can be used to filter by numeric information such as age or the number of children. For example, in Figure 3 the user has selected Helsinki from the birthplace facet and May 1918 as the time interval from the death date facet.
In the table view, the name of each person links to the individual landing page of that victim. These pages show the detailed information relating to the person, including the sources of each individual piece of information. Sometimes the sources give conflicting information about a certain detail; in that case the conflicting pieces of information are shown as a list together with their separate sources.
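One possible way to organize such source-attributed values is sketched below. The statement records, the property names, and the helpers `groupBySourcedValue` and `conflictingProperties` are all hypothetical; the sketch only shows how each distinct value can keep its own list of sources so that disagreements become visible.

```javascript
// Hypothetical statements about one person; values and source names are invented.
const statements = [
  { property: 'deathDate', value: '1918-04-06', source: 'Source A' },
  { property: 'deathDate', value: '1918-04-07', source: 'Source B' },
  { property: 'deathDate', value: '1918-04-06', source: 'Source C' },
  { property: 'birthPlace', value: 'Tampere', source: 'Source A' }
]

// Group statements by property and then by value, keeping the sources of each value.
function groupBySourcedValue (items) {
  const byProperty = new Map()
  for (const { property, value, source } of items) {
    if (!byProperty.has(property)) byProperty.set(property, new Map())
    const values = byProperty.get(property)
    if (!values.has(value)) values.set(value, [])
    values.get(value).push(source)
  }
  return byProperty
}

// A property is conflicting when the sources give more than one distinct value.
function conflictingProperties (byProperty) {
  return [...byProperty]
    .filter(([, values]) => values.size > 1)
    .map(([property]) => property)
}

const grouped = groupBySourcedValue(statements)
console.log(conflictingProperties(grouped)) // [ 'deathDate' ]
```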
The user can switch the result view using the tabs above it. In addition to the default table view, the War Victims perspective includes a pie chart view, a line chart view, a map view of death places, and an option to download the selection as a CSV file.
When selecting the pie chart view, the user is presented with a pie chart visualization of the current selection. The pie chart visualizes the relative numbers of different values within a certain type of information. By default the chart visualizes the relative numbers of victims from different parties of the Civil War. Without any selections in the facets, the user can see here that 71% of the victims in the data supported the Red side. By making selections in the facets, the user can narrow down the group to, for example, people from a certain town. The visualized property can be changed from the menu: the pie chart can also show, for example, gender, occupation, or manner of death distributions.
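The relative numbers behind such a chart are simple percentage shares. The following sketch assumes a hypothetical helper `shares` and invented counts (they are not figures from the portal).

```javascript
// Convert value counts into percentage shares, as a pie chart would display them.
function shares (counts) {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0)
  if (total === 0) return {}
  return Object.fromEntries(
    Object.entries(counts).map(([value, n]) => [value, (100 * n) / total])
  )
}

// Hypothetical counts for illustration only.
const partyCounts = { Red: 6, White: 3, Other: 1 }
console.log(shares(partyCounts)) // Red: 60, White: 30, Other: 10
```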
The line chart view shows a line chart visualization based on the current selection. The distribution of death dates shown in Figure 1 was created using this visualization view. Currently there are three options for this visualization: age at death, birth year, and death date. For example, the age at death option automatically draws a line chart where the x axis represents age in years and the y axis represents the number of victims with that age at death. The average and median values are also shown under the visualization. The user can then easily and quickly compare distributions and average values between subsets of people in the data. For example, in Figure 4, we have used the facets to select the victims who supported the Red side in the Civil War and were registered in the Viipuri province. Comparing this distribution to people from other provinces shows that, according to the data, war victims from the Viipuri province who supported the Red side tended to be older than others, with a median age over thirty. Explaining this would require more detailed analysis, but it demonstrates how faceted search combined with simple analysis tools can be effective at finding interesting phenomena in the data.
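The data behind the age at death chart and the summary values shown under it can be sketched as follows. The helper names `ageDistribution`, `mean`, and `median` and the sample ages are hypothetical, and the sketch assumes a non-empty list of ages.

```javascript
// Count how many people died at each age (x axis: age, y axis: number of victims).
function ageDistribution (ages) {
  const counts = new Map()
  for (const age of ages) counts.set(age, (counts.get(age) || 0) + 1)
  return [...counts]
    .sort(([a], [b]) => a - b)
    .map(([age, count]) => ({ age, count }))
}

// Arithmetic mean; assumes at least one value.
function mean (values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length
}

// Median; for an even number of values, the mean of the two middle values.
function median (values) {
  const sorted = [...values].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

// Hypothetical ages at death for a selected subset.
const ages = [19, 24, 24, 31, 35, 41]
console.log(ageDistribution(ages))
console.log(mean(ages), median(ages)) // 29 27.5
```

Running the same functions on two different facet selections gives the kind of comparison between subsets described above.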
The map view shows the death places of the victims in the selection on a map. The markers are clustered so that nearby places are grouped together depending on the zoom level. Each cluster shows the number of victims that died in that area. The death places are shown at the municipality level. If a victim has no death municipality data, the victim is not shown on the map.
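The idea of zoom-dependent clustering can be illustrated with a deliberately simplified grid approach. This is not the algorithm of the map library used by the portal, which is not described here; the function `clusterByGrid`, the cell size formula, the record shape, and the approximate coordinates are assumptions made for this sketch.

```javascript
// Simplified grid clustering: victims whose death place falls into the same grid
// cell form one cluster. The cell size halves with every zoom level (assumption),
// so clusters split apart when the user zooms in.
function clusterByGrid (records, zoom) {
  const cellSize = 360 / Math.pow(2, zoom) // degrees per grid cell
  const clusters = new Map()
  for (const { deathPlace } of records) {
    if (!deathPlace) continue // no death municipality: not shown on the map
    const key = Math.floor(deathPlace.lat / cellSize) + ':' + Math.floor(deathPlace.lon / cellSize)
    clusters.set(key, (clusters.get(key) || 0) + 1)
  }
  return clusters
}

// Hypothetical records with approximate municipality coordinates.
const records = [
  { deathPlace: { lat: 61.5, lon: 23.76 } }, // around Tampere
  { deathPlace: { lat: 61.5, lon: 23.76 } },
  { deathPlace: { lat: 60.17, lon: 24.94 } }, // around Helsinki
  { deathPlace: { lat: 60.71, lon: 28.75 } }, // around Viipuri
  { deathPlace: null }
]

console.log(clusterByGrid(records, 3).size) // 1 cluster when zoomed out
console.log(clusterByGrid(records, 8).size) // 3 clusters when zoomed in
```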
The final option on the right is the CSV option, which allows downloading information about the selected war victims as a CSV file in which one row corresponds to one victim. This feature was requested by history researchers and was considered important, because it allows users to easily download raw data for use with their own analysis tools. However, not all the data is included in the CSV file, as it is difficult to condense all aspects of the knowledge graph into a single table. For example, metadata such as the information sources is left out of the file.
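Flattening graph-shaped records into one row per victim might look like the sketch below. The column names, the sample records, the `"; "` separator for multi-valued fields, and the helpers `escapeCsvField` and `toCsv` are assumptions for illustration, not the export code of the portal.

```javascript
// Quote a field when it contains a comma, a double quote, or a line break.
function escapeCsvField (value) {
  const text = value === undefined || value === null ? '' : String(value)
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text
}

// One header row, then one row per victim. Multi-valued fields are joined
// with "; " so that each victim still fits on a single row.
function toCsv (records, columns) {
  const rows = records.map(record =>
    columns.map(column => {
      const value = record[column]
      return escapeCsvField(Array.isArray(value) ? value.join('; ') : value)
    }).join(',')
  )
  return [columns.map(escapeCsvField).join(','), ...rows].join('\r\n')
}

// Hypothetical selection of two victims.
const selection = [
  { name: 'Example, Person', birthplace: 'Tampere', occupations: ['worker', 'farmer'] },
  { name: 'Another Person', birthplace: 'Helsinki', occupations: [] }
]

console.log(toCsv(selection, ['name', 'birthplace', 'occupations']))
```

Joining multi-valued fields into one cell is one example of the condensation mentioned above: the table stays flat, but some structure of the knowledge graph is lost.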
The other perspective of the portal is the Battles of the Finnish Civil War perspective. It works in a similar way to the War Victims perspective. The user can search and filter the battles using facets, and the results can be shown as a table, a map, or an animation. The animation view is unique to the battles. It shows battle sites at different times on a map. A battle appears on the map as a red marker when the animation reaches the starting date of the battle, and it then stays on the map as a grey marker as the animation progresses in time. Figure 5 shows the situation in March 1918: a clear front has formed across Finland, and battles are going on along it. This view is mainly aimed at educational use.
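The marker behaviour of the animation can be expressed as a small state function. The sketch assumes that dates are ISO 8601 strings (`YYYY-MM-DD`), which compare chronologically as plain strings, and that the animation advances in discrete frames; the function name `markerState` and the dates are hypothetical.

```javascript
// Decide how a battle marker is drawn in the current animation frame.
// - 'hidden': the animation has not yet reached the start date of the battle
// - 'red':    the start date was reached in this frame
// - 'grey':   the start date was already reached in an earlier frame
// previousFrame is null for the first frame of the animation.
function markerState (battleStart, previousFrame, currentFrame) {
  if (battleStart > currentFrame) return 'hidden'
  if (previousFrame === null || battleStart > previousFrame) return 'red'
  return 'grey'
}

// Hypothetical battle starting on 15 March 1918, animated in ten-day frames.
console.log(markerState('1918-03-15', '1918-03-01', '1918-03-10')) // hidden
console.log(markerState('1918-03-15', '1918-03-10', '1918-03-20')) // red
console.log(markerState('1918-03-15', '1918-03-20', '1918-03-30')) // grey
```

Comparing against the previous frame rather than testing for equality keeps the red state visible even when the frame step is longer than one day.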
Even though the data can be accessed by anyone with SPARQL queries, this can be too technically demanding for many users. Even a researcher who is able to write her own SPARQL queries can find it useful to have an easy way to explore the data and to create simple visualizations quickly. The visualization tools provided by the portal are expected to be useful both for finding new data and for educating the public about history. These tools should not be expected to fully replace manual research and close reading. They are intended for spotting interesting phenomena in the data that require more detailed analysis.
4.2 Technical Implementation
This subsection presents how the user interface³¹ of WarVictimSampo 1914–1922 was implemented using the Sampo-UI framework³² [14]. Sampo-UI provides software developers with a ready-to-use basis for the user interface of a semantic portal, which needs only minor modifications before it can be deployed into production as a modern JavaScript web application.
Figure 6 presents the overall architecture of the WarVictimSampo 1914–1922 portal, provided by the Sampo-UI framework. The main parts are (1) a client based on the widely used and established React³³ and Redux³⁴ libraries, and (2) a Node.js³⁵ backend built with the Express framework.³⁶ The primary task of the client is to display data to the end user and to react to the user’s selections. The business logic of fetching the data using various search paradigms is placed on the backend. The client sends API requests to the backend, which queries the SPARQL endpoint and processes and merges the response rows into arrays of potentially nested JavaScript objects.
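The merging step on the backend can be sketched as follows. The input follows the standard SPARQL JSON results format, in which each row maps variable names to terms with a `value` field. Everything else is an assumption made for illustration: the helper names, the example URIs, the single-level `__` naming convention for nested properties, and the rule that a property with several distinct values becomes an array.

```javascript
// Turn one SPARQL JSON binding row into a flat { variable: value } object.
function flattenBinding (binding) {
  return Object.fromEntries(
    Object.entries(binding).map(([name, term]) => [name, term.value])
  )
}

// Variables named `a__b` become the nested property a.b (one level only; assumption).
function nestRow (flatRow) {
  const nested = {}
  for (const [name, value] of Object.entries(flatRow)) {
    const [head, tail] = name.split('__')
    if (tail === undefined) nested[head] = value
    else nested[head] = { ...nested[head], [tail]: value }
  }
  return nested
}

// Merge all rows that share the same `id` into one object per entity.
function mergeRows (bindings) {
  const byId = new Map()
  for (const binding of bindings) {
    const row = nestRow(flattenBinding(binding))
    if (!byId.has(row.id)) byId.set(row.id, {})
    const collected = byId.get(row.id)
    for (const [key, value] of Object.entries(row)) {
      if (key === 'id') continue
      if (!collected[key]) collected[key] = new Map()
      collected[key].set(JSON.stringify(value), value) // drop repeated values
    }
  }
  return [...byId].map(([id, collected]) => {
    const result = { id }
    for (const [key, values] of Object.entries(collected)) {
      const list = [...values.values()]
      result[key] = list.length === 1 ? list[0] : list
    }
    return result
  })
}

// Hypothetical response rows: one person with two occupations, one without any.
const uri = value => ({ type: 'uri', value })
const lit = value => ({ type: 'literal', value })
const bindings = [
  { id: uri('http://example.org/p1'), name: lit('Person One'), occupation__id: uri('http://example.org/o1'), occupation__prefLabel: lit('worker') },
  { id: uri('http://example.org/p1'), name: lit('Person One'), occupation__id: uri('http://example.org/o2'), occupation__prefLabel: lit('farmer') },
  { id: uri('http://example.org/p2'), name: lit('Person Two') }
]

console.log(JSON.stringify(mergeRows(bindings), null, 2))
```

The three response rows collapse into two objects, and the first object carries an `occupation` array with two nested objects, which is the kind of structure a table or landing page component can render directly.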
For geographic visualizations, the client is integrated with external vector and raster data sources. In WarVictimSampo 1914–1922, a raster version and a vector version of the Mapbox Light basemap³⁷ are used, depending on the requirements set by the underlying geographical visualization library.
The development of the user interface started by forking the Sampo-UI GitHub repository. Sampo-UI provides a pre-configured environment for full stack JavaScript development. Babel³⁸ is used for converting the latest features of JavaScript, such as arrow functions and the async/await syntax, into a backwards compatible version of the language for current and older browsers. Webpack³⁹ provides the automatically restarting development server for the client and bundles all source code and dependencies into static assets. The Node.js backend is run concurrently with the client and is automatically restarted using Nodemon⁴⁰ when the source code is changed. A uniform coding style is enforced by using the JavaScript Standard Style⁴¹ package.
The user interface of WarVictimSampo 1914–1922 was developed on the basis of the default structure and components provided by Sampo-UI. The three main views, the landing page of the portal, the faceted search perspective, and the entity landing page, are presented in Figure 7.
For implementing these three main views, the Sampo-UI framework provides the developer with approximately 120 ready-to-use user interface components. The portal landing page component was configured to display links to two faceted search perspectives: War Victims and Battles. The faceted search perspectives were implemented using a combination of Sampo-UI’s facet and result set visualization components. Entity landing pages were created for war victims and information sources by extending the general entity landing page component of Sampo-UI. The extension was needed for showing the sources for different pieces of information.
From a technical perspective, the sustainability of the WarVictimSampo 1914–1922 portal is fostered by the open source code, extensive documentation, and modular architecture of Sampo-UI. A significant part of the logic related to the various search paradigms and the processing of search results is carried out in the backend using pure JavaScript, instead of integrating this functionality inside client-side frameworks or libraries, which are known to become deprecated considerably sooner than pure JavaScript code. The core functionality provided by the framework includes robust patterns and tools for processing the data and delivering it to the components in a predictable and uniform way, which makes it straightforward to include new features and result set visualizations in the future.