1 Introduction

Local governments and domain-specific data platforms are increasingly publishing open data. These data cover a broad range of topics for each area, such as traffic, disaster prevention, restaurants, and services, and are expected to be useful information sources for citizens and tourists. Although the data are usually published at network-reachable locations, each dataset has to be handled individually according to its own format, and in some cases conversion of both format and semantics is required, which is a barrier to use.

Meanwhile, in existing information services for tourists, especially smartphone applications, the content provided is selective and limited to specific fields and target areas. That is, content coverage is a problem. In addition, maintaining and updating content is often costly, so the content frequently becomes obsolete.

Given the existence of data platforms developed for each field, this paper presents the data linkage challenges that must be addressed to integrate and use them. The paper also presents a one-stop smart city application developed on top of such data linkage. This application provides information for tourists during normal times, and it can also track the situation in town, such as congestion, in real time, in order to grasp the flows and local stagnation that occur in bursts during events or accidents. The application uses crowdsourcing to collect information on situations in town from users.

2 Background

2.1 Location-Based Information Services

Many network services that use location data have been proposed, and some of them, such as foursquare (Footnote 1), are becoming popular. Location information is usually given as geographical coordinates (latitude and longitude), as a location identifier such as a facility ID in a geographical information system (GIS), or as a postal address. Google has launched Google Places (Footnote 2), which gathers place information from actively participating users and delivers it through Google’s web site and API (application programming interface). Google may be trying to grasp facts and activities in the real world, where it does not yet have enough information, even though it seems to have become an omniscient giant in the cyber world. Google already captures some real-world phenomena with its own resources. For example, it gathers landscape images with its own fleet of specially adapted cars for the Google Street View service (Footnote 3). However, capturing and digitizing facts and activities in the real world is generally very expensive if one tries to obtain more than geotagged photo images. Although Google Places may be one reasonable approach to gathering real-world information, there is no guarantee that it will grow into an effective and reliable source reflecting the real world.

Existing social information services, such as Facebook and Twitter, are expanding to attach location data to users’ content.

2.2 Open Data

The Open Definition states that “Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness)” [10]. Although the idea of open data has been around for a long time, in recent years the term “open data” has come to refer specifically to government open-data initiatives, such as data.gov, data.gov.uk, and data.go.jp. To promote transparency, accountability, and public participation, governments make information publicly available as machine-readable open data.

Linked Open Data (LOD) is Linked Data released under an open license that does not impede its free reuse [5]. Linked Data is structured data that is interlinked with other data so that it can be shared and processed semantically by computers. Tim Berners-Lee advocated the following five-star rating scheme for LOD:

  1. Available on the web (whatever format) but with an open licence, to be Open Data

  2. Available as machine-readable structured data (e.g. Excel instead of image scan of a table)

  3. As (2) plus non-proprietary format (e.g. CSV instead of Excel)

  4. All the above plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff

  5. All the above plus: link your data to other people’s data to provide context

Machine processing of open data is usually considered to require a rating of three or more stars.
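As a rough illustration of how the five-star scheme could be checked programmatically, the following sketch assigns a star count to a dataset description and applies the three-star threshold mentioned above. The `Dataset` fields and the function names are hypothetical and chosen only for this example; the scheme itself does not prescribe any such data structure.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustrative only)."""
    open_license: bool          # star 1: on the web under an open licence
    machine_readable: bool      # star 2: structured data, not e.g. a scanned image
    open_format: bool           # star 3: non-proprietary format such as CSV
    uses_w3c_standards: bool    # star 4: RDF/SPARQL used to identify things
    linked_to_other_data: bool  # star 5: links to other people's data


def star_rating(dataset: Dataset) -> int:
    """Return 0 to 5 stars; each star requires all the lower ones."""
    criteria = [
        dataset.open_license,
        dataset.machine_readable,
        dataset.open_format,
        dataset.uses_w3c_standards,
        dataset.linked_to_other_data,
    ]
    stars = 0
    for satisfied in criteria:
        if not satisfied:
            break
        stars += 1
    return stars


def suitable_for_machine_processing(dataset: Dataset) -> bool:
    """Apply the three-or-more-stars rule of thumb from the text."""
    return star_rating(dataset) >= 3
```

Under these assumptions, an openly licensed CSV file that uses neither RDF nor links to other data would receive three stars and pass the threshold.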

2.3 Crowdsourcing for Civil Problems

The term “crowdsourcing” was coined by Jeff Howe in 2006 [7], who defined it as the act of taking a task traditionally performed by a designated agent and outsourcing it by making an open call to an undefined but large group of people [8]. This can take the form of peer production, but is also often undertaken by sole individuals [6].

The concept of smart cities can be viewed as a recognition of the growing importance of digital technologies for a competitive position and a sustainable future [11]. The smart city agenda, which tasks ICTs with achieving strategic urban development goals such as improving citizens’ quality of life and creating sustainable growth, has gained a lot of momentum in recent years.

Tools such as smartphones offer the opportunity to facilitate co-creation between citizens and authorities. Such tools have the potential to organize and stimulate communication between citizens and authorities, and allow citizens to participate in the public domain [4, 12]. One example is FixMyStreet (Footnote 4), which enables citizens to report broken streetlights and potholes [9]. It is important to note that these approaches will not succeed automatically: social standards such as trust, openness, and consideration of mutual interests have to be guaranteed, which makes engaging citizens in the public domain challenging.

Waze (Footnote 5) is another crowdsourcing service, which collects traffic data. Although Waze provides users with traffic information collected from other users and with a route navigation function, this does not seem to be enough to motivate users to get involved, because the recommended routes are not as good as those of dedicated car navigation appliances, especially in Japan, where such appliances are well developed.

The authors have researched and proposed some crowdsourcing applications. One is an online driving recorder service to collect both sensor data and videos, recorded from the view of the driver; by using this application, users benefit from a free record of their driving, and the authors obtain large amounts of low-cost sensor data. Then, the authors estimate road surface conditions by analysing such collected sensor data [1, 3]. Another application can collect and share the location information of mobile phones cheaply in a public place like a bus [2].

3 Zap Sapporo: An LBS for Explorers in Town

3.1 Service Description

The service is accessed through a smartphone application. The authors have developed a location-based service (LBS) application called “Zap Sapporo”, which was made available to the public in February 2020.

The service is designed for strollers visiting the Sapporo area. Sapporo is the fifth largest city in Japan, with a population of about 1.91 million. The city receives an average of about 6 m of snowfall annually, with an average maximum snow depth of about 1 m in February. The city spends more than 15 billion yen every winter on road management activities such as snowplowing and snow removal. Sapporo appears to be one of the most “challenged” cities in the world because its citizens demand good facilities and services even though the climate is severe. Sapporo has almost twice as much snowfall as Quebec City, while its population is nearly four times larger and is increasing.

When visitors arrive in the service area and access the service, they can get information about scheduled events and integrated contents.

Zap Sapporo is more than just an LBS: it is an application that enables crowdsourcing, in particular the sharing of real-time city conditions.

The major functions of the service are described in the following subsections.

3.2 User Experience

The application screen is composed of the following tabs, and the user switches to the desired item by selecting a tab.

  • Current Event

  • Restaurant

  • Transit

  • Bear

  • Hospital

  • Other

The application supports English, Chinese (Traditional and Simplified), and Korean in addition to Japanese, and provides content in the selected language.

Fig. 1. Snapshot images of current events in the Zap Sapporo application

Fig. 2. Snapshot images of posting a tweet on current congestion status in the Zap Sapporo application

Fig. 3. A scene of an event in Sapporo in the winter of 2020

Current Event. The current event is displayed on the application home page. This page contains information about the event and useful links to external sites, such as the official website.

In addition to receiving event information, users can post information about the event themselves. Such user-generated information is shared on the page. Figure 1 shows an example of the Current Event tab. This example contains public viewing information for an international sporting event held in Sapporo in the fall of 2019. In the city center, several public screens were deployed to show the matches live. The page lists these public viewing venues together with the latest congestion status posted by users. In this example, all venues are fully packed, but the bars differ in length because the length indicates each venue’s capacity.

Users can post the congestion status of the venue where they are by tapping the blue button in the lower right corner. After tapping the button, the venue selector appears (Fig. 2(a)). Users are then asked to select a congestion category (Fig. 2(b)). The available categories are as follows:

  • vacant

  • has room

  • crowded

  • full

  • so many people

  • cannot move

Users choose one of them subjectively. Finally, they add a comment and complete the post (Fig. 2(c)).

Posted tweets including congestion status are processed automatically and the information in the application is updated as shown in Fig. 1. Figure 3 shows the scene of another event in Sapporo in the winter of 2020.
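The automatic processing of congestion posts is not detailed in the text. As a minimal sketch of one way it could work, the following code keeps the most recent valid post per venue, which corresponds to the "latest congestion status" shown in the venue list. The `CongestionPost` type, the `latest_status` function, and the assumption that the six categories run from least to most crowded are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime

# Categories from the text, assumed here to be ordered from least to most crowded.
CONGESTION_LEVELS = [
    "vacant",
    "has room",
    "crowded",
    "full",
    "so many people",
    "cannot move",
]


@dataclass
class CongestionPost:
    """Hypothetical representation of one user post."""
    venue: str
    level: str
    comment: str
    posted_at: datetime


def latest_status(posts):
    """Return a dict mapping each venue to its most recent valid post."""
    latest = {}
    for post in posts:
        if post.level not in CONGESTION_LEVELS:
            continue  # ignore posts with an unknown category
        current = latest.get(post.venue)
        if current is None or post.posted_at > current.posted_at:
            latest[post.venue] = post
    return latest
```

Because the chosen category is subjective, a production system might instead aggregate several recent posts per venue; this sketch deliberately uses only the newest one.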

Restaurant. The second tab is for restaurants and bars. The list of restaurants can be sorted by distance from the user’s location. Users can also filter the list by specified tags that correspond to categories. Figure 4 shows the list of restaurants sorted by distance.

The restaurant information provided here is integrated from several information sources such as Sapporo City, Sapporo Tourism Association, and the Sapporo Chamber of Commerce and Industry. The integrated content used here is described in Sect. 4.
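The distance sorting and tag filtering of the Restaurant tab can be sketched as follows. This is an illustrative example, not the application's actual implementation: the `Restaurant` type and the `nearby` function are hypothetical, and great-circle distance computed with the haversine formula is assumed as the distance measure.

```python
import math
from dataclasses import dataclass, field

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


@dataclass
class Restaurant:
    """Hypothetical restaurant entry with a position and category tags."""
    name: str
    lat: float
    lon: float
    tags: set = field(default_factory=set)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby(restaurants, user_lat, user_lon, required_tags=frozenset()):
    """Keep restaurants that carry all required tags, nearest first."""
    matches = [r for r in restaurants if set(required_tags) <= r.tags]
    return sorted(
        matches,
        key=lambda r: haversine_m(user_lat, user_lon, r.lat, r.lon),
    )
```

For a city-scale list, computing the distance for every entry on each request is usually cheap enough; a spatial index would only become relevant for much larger collections.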

Transit. Real-time traffic information and transfer guides can be important for visitors to the city. The Transit tab shows such transit information. Current traffic information is displayed at the top. This traffic information is provided by Data Smart City Sapporo (DSCS), the open data platform of Sapporo City, and is updated every 10 min.

Six major modes of transport (subway, streetcar, Japan Railway (JR), bus, taxi, and rental bicycle) are displayed in the center of the page (Fig. 5(a)). The subway subpage displays a route map and timetables that are also provided by DSCS (Fig. 5(b)).

Others. The Zap Sapporo application has three more tabs: Bear, Hospital, and Other. The Bear tab shows information on sightings of wild bears collected by the city hall. The spots where bears were sighted are plotted on a map. The Hospital tab displays a list of hospitals in the Sapporo area compiled by the Hokkaido government.

The “Other” tab provides miscellaneous information, especially for foreign visitors (Fig. 6). Public Wi-Fi spots, restrooms in public parks, and foreign diplomatic missions are provided (Fig. 6(a)). For example, a list of foreign consulates located in Sapporo is provided. Almost all of the information on this list has been taken from the website of the Ministry of Foreign Affairs of Japan.

Fig. 4. Snapshot images of restaurant list in the Zap Sapporo application

3.3 Sensing Functions

User Data. The Zap Sapporo service collects the following user attributes:

  • nationality

  • country of residence

  • purpose of visiting Japan

Onboard Location and Motion Sensors. The Zap Sapporo application retrieves position and movement data from onboard sensors even in background mode. The collected data is pooled in a local data store and sent to a log server.

Collected data are as follows:

  • Location (latitude, longitude, altitude, horizontal/vertical accuracies)

  • Heading (magnetic, true, accuracy)

  • Move (course, speed)

The collected data will be used for city dynamic analysis.
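The pooling of sensor readings in a local store before transmission to the log server can be sketched as a simple batching buffer. The field names below follow the list of collected data; the `timestamp` field, the JSON payload format, the batch size, and the `send` callback are assumptions made for illustration and are not described in the text.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SensorSample:
    """One reading; timestamp is an assumed extra field, not listed in the text."""
    timestamp: float
    latitude: float
    longitude: float
    altitude: float
    horizontal_accuracy: float
    vertical_accuracy: float
    magnetic_heading: float
    true_heading: float
    heading_accuracy: float
    course: float
    speed: float


class SampleBuffer:
    """Pools samples locally and hands them to `send` in batches."""

    def __init__(self, send, batch_size=100):
        self._send = send  # hypothetical callable that uploads one JSON string
        self._batch_size = batch_size
        self._pending = []

    def add(self, sample):
        """Store a sample and upload once a full batch has accumulated."""
        self._pending.append(sample)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        """Upload all pending samples; they are dropped only after send returns."""
        if not self._pending:
            return
        payload = json.dumps([asdict(s) for s in self._pending])
        self._send(payload)
        self._pending.clear()
```

Clearing the buffer only after `send` returns means that samples are retained for a later retry if the upload raises an exception, which matters for an application that collects data in background mode with intermittent connectivity.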

Fig. 5. Snapshot images of transit information in the Zap Sapporo application

4 Integration of Distributed Contents

4.1 Issues of Integration

In conventional LBS applications that provide information for visitors, such as tourists, content is prepared and maintained individually. However, individual content development has the following problems:

  • Aggregating a lot of content and maintaining it is costly.

  • Data provided by individual content providers only covers some stores, and coverage is low.

  • Content provided by individual content providers is heterogeneous and not easy to integrate.

Generally, various types of data are managed and provided in an autonomous, distributed manner. From the viewpoint of data linkage, the same data appear in different formats, structures, and expressions, which is an obstacle to cross-use. At present, data that should be connected are not connected. Data linkage faces the following issues:

  • a large variety and a large amount of data

  • distributed management of data

  • data written in different ways

    • the same data in different formats

    • the same data in different structures

    • the same data in different expressions

Here, the authors focus particularly on the third issue. Differences in format may correspond to the 2-star class of the five-star rating scheme of LOD shown in Sect. 2.2, while differences in structure correspond to the 3-star class. To actually use different data through computational integration, a match at the expression level, that is, the 4-star class, is required. In particular, when trying to use data across different fields, vocabulary generally differs considerably, and there are many technical issues in realizing such integration.

This paper attempts to integrate content from various information sources, targeting disambiguation at the level of attribute labels, and to make the result available through Zap Sapporo.
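Disambiguation at the level of attribute labels can be illustrated with a small alias table that maps source-specific labels onto a common vocabulary. The labels in `LABEL_ALIASES` are hypothetical examples and are not taken from the actual data collections; the sketch only shows the general shape of such a normalization step.

```python
# Hypothetical mapping from source-specific attribute labels to common labels.
LABEL_ALIASES = {
    "name": "name",
    "shop_name": "name",
    "store": "name",
    "address": "address",
    "location": "address",
    "tel": "phone",
    "telephone": "phone",
    "phone_number": "phone",
}


def normalize_record(record):
    """Rename known labels to the common vocabulary.

    Labels without a known alias are kept under "extra" so that no
    source information is lost. If two source labels map to the same
    common label, the first value encountered is kept.
    """
    normalized = {"extra": {}}
    for label, value in record.items():
        common = LABEL_ALIASES.get(label.strip().lower())
        if common is None:
            normalized["extra"][label] = value
        else:
            normalized.setdefault(common, value)
    return normalized
```

A hand-maintained table like this only scales to a handful of sources, which is consistent with the difficulty the text describes when vocabularies differ across fields.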

Fig. 6. Snapshot images of miscellaneous information in the Zap Sapporo application

4.2 Integration of Restaurant Contents of Sapporo

Restaurant content in Zap Sapporo is produced by integrating content from multiple information sources. One of the fundamental sources is Sapporo City, which publishes the List of Food Business Permits on DSCS. This list can be regarded as a nearly complete list of restaurants in the city, because such a permit is required to operate a restaurant legally in the city. Unfortunately, the list has few attributes, such as the name of the operator and the postal address.

Another source is Sapporo Tourism Association, a non-profit organization that promotes tourism in Sapporo. It operates “Yokoso Sapporo (welcome to Sapporo)”, an official website of Sapporo tourism, and also produces leaflets and guidebooks for visitors. In addition, it provides a smartphone application called “Sapporo Gourmet Coupon”.

The Sapporo Chamber of Commerce and Industry also maintains several lists of restaurants to promote local business. One is a restaurant list in English; another is a list of confectioneries in the greater Sapporo area.

Table 1 shows the data collections to be integrated.

Table 1. Data collections

Sapporo Gourmet Coupon and Yokoso Sapporo contain few attributes but rich descriptions of restaurants. The tourist friendly restaurant collection has useful attributes for foreign visitors, such as available languages and cashless payment. The night map is unique because it contains many restaurants that do not appear in the other collections, but integrating it is challenging because items such as names and addresses are simplified.

At this stage, these collections have been integrated manually. This revealed many problems with the accuracy of character-level name matching, and showed that addresses are unreliable because they contain spelling errors and mistakes. Intuitively, phone numbers seem to be the most consistent and dependable attribute, and thus the most useful for matching.

For example, “Ramen SORA”, a ramen restaurant, appears in three collections: the List of Food Business Permits, Sapporo Gourmet Coupon, and the night map. The name has two variations: “ ” and “ ”. Both “ ” and “ ” are phonetic characters, so these two can easily be integrated automatically. The postal address is expressed differently in each collection: “ ”, “ ”, and “ ”. The third expression is succinct and rough, so matching may have to be probabilistic. The phone number is contained in Sapporo Gourmet Coupon and the night map, and the two numbers are exactly the same.

Another example is an okonomiyaki restaurant. Its name in the List of Food Business Permits is “ ”, while its name in the tourist friendly restaurant collection is “ ”, which is decorated to attract attention. The postal address in the tourist friendly restaurant collection is rough and in English: “060-0806 West exit PASEO 1F, Kita 6, Nishi 4, Kita-ku, Sapporo”.

Based on the above analysis, a heuristic-based matching program has enabled partial integration of the collections. In the future, the authors aim at automatic, high-accuracy integration using natural language processing and machine learning.
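The heuristic matching program itself is not described in detail. The following sketch shows one plausible set of heuristics consistent with the analysis above: phone numbers are compared first because they are the most dependable attribute, and names are compared only as a fallback after folding the two phonetic scripts together. It assumes that the phonetic variations are hiragana and katakana, that records are dictionaries with hypothetical `phone` and `name` keys, and that phone numbers may carry the +81 country code.

```python
import re
import unicodedata


def normalize_phone(raw):
    """Reduce a phone number to domestic digits only, e.g. '0112223333'."""
    digits = re.sub(r"\D", "", unicodedata.normalize("NFKC", raw))
    # Assumption: a leading 81 on a long number is Japan's country code.
    if digits.startswith("81") and len(digits) > 10:
        digits = "0" + digits[2:]
    return digits


def normalize_name(raw):
    """Fold width, case, whitespace, and katakana/hiragana differences."""
    text = unicodedata.normalize("NFKC", raw).lower()
    chars = []
    for ch in text:
        code = ord(ch)
        if 0x30A1 <= code <= 0x30F6:
            ch = chr(code - 0x60)  # shift katakana to the matching hiragana
        if not ch.isspace():
            chars.append(ch)
    return "".join(chars)


def same_restaurant(a, b):
    """Heuristic match: trust phone numbers first, then normalized names."""
    phone_a = normalize_phone(a.get("phone", ""))
    phone_b = normalize_phone(b.get("phone", ""))
    if phone_a and phone_b:
        return phone_a == phone_b
    name_a = normalize_name(a.get("name", ""))
    name_b = normalize_name(b.get("name", ""))
    return name_a != "" and name_a == name_b
```

Addresses are deliberately left out of this sketch, since the text notes that they contain errors and that rough address expressions would require probabilistic matching. Decorated names, as in the okonomiyaki example, would also defeat the exact name comparison used here.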

5 Conclusions

This paper presented a one-stop smart city application built on data linkage across existing data platforms. The application provides information for tourists during normal times, and it can also track the situation in town, such as congestion, in real time, in order to grasp the flows and local stagnation that occur in bursts during events or accidents.

Given the existence of data platforms developed for each field, the paper also presented the data linkage challenges that must be addressed to integrate and use them.

The authors continue to develop and provide both the application and the data linkage platform. Experiments are being planned to evaluate the effectiveness of the proposed systems.