Different tourists have different travel interests. When many people visit the same country or city, such as Japan, most sites are visited, not just the famous ones. This is consistent with the trajectory data in a large travelogue collection, which are dense in all places. When we visualize these trajectories directly, the lines cross one another and the points are scattered across the whole map. To reduce the visual confusion caused by line crossings, we used an edge bundling algorithm.
In the geometry-based edge bundling method [18], a control mesh guides the bundling of similar edges. The Force-Directed Edge Bundling (FDEB) method [19], which is based on the force-directed algorithm, clusters edges by modeling them as springs. However, traditional edge bundling algorithms lack multilevel interaction for massive data. Gansner et al. [20] proposed a multilevel edge bundling algorithm based on the force-directed edge bundling algorithm. Their algorithm offers many parameters for interacting with a massive graph in multilevel visualizations. The most important parameter for us is the one that adjusts the edge similarity; at its largest value, the bundled graph resembles a flow map. However, edge bundling alone is far from enough. The trajectories are composed of tourist attractions, restaurants, shopping destinations and other locations. They are messy data, which means we cannot directly explore the movement patterns. We require aggregation visualization methods to solve this problem. Thus, we propose three aggregation visualization methods to understand the trajectories in travelogues and to support travel route planning. Multilevel aggregation visualization requires a good plotting scale, so we designed an adaptive plotting scale algorithm to choose the cutoff points for multilevel visualization. Finally, because the data are spatio-temporal, we examine them along both the time dimension and the space dimension.
6.2. Multilevel Aggregation Procedure and Adaptive Plotting Scale
Data aggregation is any process in which information is gathered and expressed in a summary form for purposes such as statistical analysis. A common aggregation process is to obtain more information about particular groups based on specific variables such as age, profession or income. Aggregation visualization is a common method for reducing the visual confusion of big data. Chen et al. [25] aggregated WeiBo data by place, extracting the number of visits, the number of unique visitors and the average time interval between movements into and out of a place.
Generally, aggregating travelogue trajectory data yields a sequence of geo-locations with corresponding attributes in spatio-temporal space. Formally, a place is a pair of a name and its attributes, e.g., <Hawaii, (State)>. After basic statistical processing, we can derive further attributes of a place, such as its number of visits. Aggregating on a specific attribute maps the values of that attribute to a category. A trajectory is a sequence of such places; after aggregation, each place in the sequence is paired with its category. For example, for the trajectory <Honolulu, (City)>, <University of Hawaii, (Street)>, <Anchorage, (City)>, aggregating at the State administrative level gives <Hawaii, <Honolulu, (City)>>, <Hawaii, <University of Hawaii, (Street)>>, <Alaska, <Anchorage, (City)>>.
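The State-level example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lookup table `STATE_OF` and the function `aggregate_by_state` are hypothetical names, and a real system would resolve place names against a gazetteer rather than a hard-coded dictionary.

```python
# Hypothetical lookup from place name to the state that contains it.
STATE_OF = {
    "Honolulu": "Hawaii",
    "University of Hawaii": "Hawaii",
    "Anchorage": "Alaska",
}


def aggregate_by_state(trajectory):
    """Pair each (name, level) place with its State category."""
    return [(STATE_OF[name], (name, level)) for name, level in trajectory]


trajectory = [
    ("Honolulu", "City"),
    ("University of Hawaii", "Street"),
    ("Anchorage", "City"),
]
aggregated = aggregate_by_state(trajectory)
for category, place in aggregated:
    print(category, place)
```

Each output pair keeps the original place, so the aggregated sequence can still be traced back to the raw trajectory.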
Aggregation combines large data by one attribute, while multilevel aggregation does so at multiple levels. For a single trajectory sequence, we can therefore obtain several sequences, one per level. Formally, after multilevel aggregation each place in the trajectory sequence is paired with its category k at level l_i. For example, for the trajectory <Honolulu, (City)>, <University of Hawaii, (Street)>, <Anchorage, (City)>, multilevel aggregation on the administrative attribute uses the levels State, City, Town and Street. At the State level, we get <Hawaii, <Honolulu, (City)>>, <Hawaii, <University of Hawaii, (Street)>>, <Alaska, <Anchorage, (City)>>. At the City level, we get <Honolulu, <Honolulu, (City)>>, <Honolulu, <University of Hawaii, (Street)>>, <Anchorage, <Anchorage, (City)>>.
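A minimal sketch of multilevel aggregation for the same trajectory follows. The `HIERARCHY` table and the function `aggregate_multilevel` are hypothetical and only cover the State and City levels used in the example; they are not taken from the original system.

```python
# Hypothetical administrative hierarchy: place name -> category at each level.
HIERARCHY = {
    "Honolulu": {"State": "Hawaii", "City": "Honolulu"},
    "University of Hawaii": {"State": "Hawaii", "City": "Honolulu"},
    "Anchorage": {"State": "Alaska", "City": "Anchorage"},
}


def aggregate_multilevel(trajectory, levels):
    """Return one aggregated sequence per requested level."""
    return {
        level: [
            (HIERARCHY[name][level], (name, kind)) for name, kind in trajectory
        ]
        for level in levels
    }


trajectory = [
    ("Honolulu", "City"),
    ("University of Hawaii", "Street"),
    ("Anchorage", "City"),
]
by_level = aggregate_multilevel(trajectory, ["State", "City"])
for level, sequence in by_level.items():
    print(level, sequence)
```

Storing the result as a mapping from level to sequence lets a visualization switch levels without recomputing the aggregation.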
The important part is how to find the level division. For aggregation on the administrative attribute, it is straightforward to map location names to administrative levels. For attributes with numeric values, level division needs a plotting scale derived from statistical analysis. First, we collect all possible values of the attribute; then we use the plotting scale to map them to a given range of categories. For example, for aggregation by hottest site, the attribute is the number of times a place occurs in the trajectories. We collect the visit counts of all places as a list and use a linear scale to divide the value range into several categories.
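As a rough sketch of the linear plotting scale described above, the function below divides the value range into equal-width bins and returns a category index for each value. The name `linear_scale`, the number of categories and the visit counts are illustrative assumptions, not data from the travelogues.

```python
def linear_scale(values, n_categories):
    """Map each value to a category index in [0, n_categories - 1]
    using equal-width bins over the observed value range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_categories
    # The maximum value would fall just outside the last bin, so clamp it.
    return [min(int((v - lo) / width), n_categories - 1) for v in values]


visits = [1, 2, 3, 10, 50, 100]  # made-up visit counts
categories = linear_scale(visits, 4)
print(categories)
```

With skewed counts such as these, most places land in the lowest category, which illustrates why a linear scale can be a poor fit for data that do not follow a uniform distribution.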
Figure 3 is the flowchart that shows the multilevel aggregation procedure.
Traditional scales use a mathematical function to map an input domain to an output range. The choice of function is determined by the data distribution. However, the data do not always conform to a standard distribution. In this case, we need a method that finds good cutoff points according to the data distribution itself. A cutoff is good when the data between two adjacent cutoff points display similar characteristics. This principle matches that of clustering, so we used a clustering method to choose the cutoff points in the multilevel case.
Figure 4 shows the procedure for adaptive scales; smoothing on the range length is applied after the cluster method.
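The text does not name the clustering method, so the following sketch assumes one-dimensional k-means (Lloyd's algorithm) and places each cutoff at the midpoint between adjacent cluster centres. The function names, the value of `k` and the sample data are hypothetical, and the smoothing step on the range length is not included.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm in one dimension; returns sorted cluster centres."""
    data = sorted(values)
    # Initialise centres at evenly spaced positions in the sorted data.
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centre.
        new_centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return sorted(centres)


def cutoff_points(values, k):
    """Cutoffs are the midpoints between adjacent cluster centres."""
    centres = kmeans_1d(values, k)
    return [(a + b) / 2 for a, b in zip(centres, centres[1:])]


visits = [1, 2, 3, 10, 12, 50, 55, 100]  # made-up visit counts
cutoffs = cutoff_points(visits, 3)
print(cutoffs)
```

In one dimension, the boundary between two k-means clusters is exactly the midpoint of their centres, so the returned cutoffs separate groups of values that are close to one another. K-means converges to a local optimum, so other initialisations may give different cutoffs.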
6.4. Aggregation in the Space Dimension
For spatial data, the administrative level is the most natural view. Beyond that, we treat all the trajectories together as a trajectory network on the map, in which the point weights and the line weights are our main concern. Therefore, we also aggregate the data by hottest site and by hottest trajectory.
6.4.1. Aggregation by Administrative Region
We extracted the trajectory data from the travelogues, then aggregated them by administrative region at two levels. The first level was the country or city and the second the district or town. Aggregating the trajectory data in this way helped us understand the data better. Most importantly, this method partly solved the problem of data variety. Once the spatial data were sorted by administrative level, we could explore the movement patterns within the same administrative level.
6.4.2. Aggregation by Hottest Site
We extracted trajectories from all travelogues, then sorted the locations by frequency, which we interpreted as popularity. By taking all trajectories as the trajectory network on the map, we explored the population movement pattern from the spatial point of view. After aggregating the data by hottest site, we could explore the popularity of each site. Based on this visualization, we could better explore the patterns for a site.
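Counting site frequency can be sketched as follows. The function name `site_popularity` and the sample trajectories are made up for illustration; each trajectory is assumed to be a list of place names.

```python
from collections import Counter


def site_popularity(trajectories):
    """Count how often each place occurs across all trajectories,
    returning (place, count) pairs sorted from most to least frequent."""
    counts = Counter(
        place for trajectory in trajectories for place in trajectory
    )
    return counts.most_common()


trajectories = [  # made-up sample data
    ["Tokyo", "Kyoto", "Osaka"],
    ["Tokyo", "Kyoto"],
    ["Tokyo", "Nara"],
]
ranking = site_popularity(trajectories)
print(ranking)
```

The resulting counts can serve as the point weights of the trajectory network.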
6.4.3. Aggregation by Hottest Trajectory
We extracted trajectories from all travelogues, then sorted the trajectories by frequency, which we took as a measure of popularity. Taking all trajectories together as the trajectory network on the map, we explored the movement patterns from the spatial view. After aggregating the data by hottest trajectory, we could explore the popularity of each trajectory. Based on this visualization, we could better explore the trajectory patterns.
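One way to measure trajectory frequency is to count how often each directed segment between consecutive places occurs. This is an assumption made for the sketch: the text does not say whether frequency is counted per whole trajectory or per segment. The function name `segment_popularity` and the sample data are hypothetical.

```python
from collections import Counter


def segment_popularity(trajectories):
    """Count each directed segment (from_place, to_place) between
    consecutive places, sorted from most to least frequent."""
    counts = Counter(
        (a, b)
        for trajectory in trajectories
        for a, b in zip(trajectory, trajectory[1:])
    )
    return counts.most_common()


trajectories = [  # made-up sample data
    ["Tokyo", "Kyoto", "Osaka"],
    ["Tokyo", "Kyoto"],
    ["Tokyo", "Nara"],
]
ranking = segment_popularity(trajectories)
print(ranking)
```

Segment counts of this kind can serve as the line weights of the trajectory network.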