Search | arXiv e-print repository

Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation

Authors: Jiawei Wang, Renhe Jiang, Chuang Yang, Zengqing Wu, Makoto Onizuka, Ryosuke Shibasaki, Noboru Koshizuka, Chuan Xiao

Abstract: This paper introduces a novel approach using Large Language Models (LLMs) integrated into an agent framework for flexible and effective personal mobility generation. LLMs overcome the limitations of previous models by effectively processing semantic data and offering versatility in modeling various tasks. Our approach addresses three research questions: aligning LLMs with real-world urban mobility… ▽ More This paper introduces a novel approach using Large Language Models (LLMs) integrated into an agent framework for flexible and effective personal mobility generation. LLMs overcome the limitations of previous models by effectively processing semantic data and offering versatility in modeling various tasks. Our approach addresses three research questions: aligning LLMs with real-world urban mobility data, developing reliable activity generation strategies, and exploring LLM applications in urban mobility. The key technical contribution is a novel LLM agent framework that accounts for individual activity patterns and motivations, including a self-consistency approach to align LLMs with real-world activity data and a retrieval-augmented strategy for interpretable activity generation. We evaluate our LLM agent framework and compare it with state-of-the-art personal mobility generation approaches, demonstrating the effectiveness of our approach and its potential applications in urban mobility. Overall, this study marks the pioneering work of designing an LLM agent framework for activity generation based on real-world human activity data, offering a promising tool for urban mobility analysis. △ Less

Submitted 23 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

Comments: Source codes are available at https://github.com/Wangjw6/LLMob/

arXiv:2309.14216 [pdf, other]

doi 10.1145/3583780.3614962

MemDA: Forecasting Urban Time Series with Memory-based Drift Adaptation

Authors: Zekun Cai, Renhe Jiang, Xinyu Yang, Zhaonan Wang, Diansheng Guo, Hiroki Kobayashi, Xuan Song, Ryosuke Shibasaki

Abstract: Urban time series data forecasting featuring significant contributions to sustainable development is widely studied as an essential task of the smart city. However, with the dramatic and rapid changes in the world environment, the assumption that data obey Independent Identically Distribution is undermined by the subsequent changes in data distribution, known as concept drift, leading to weak repl… ▽ More Urban time series data forecasting featuring significant contributions to sustainable development is widely studied as an essential task of the smart city. However, with the dramatic and rapid changes in the world environment, the assumption that data obey Independent Identically Distribution is undermined by the subsequent changes in data distribution, known as concept drift, leading to weak replicability and transferability of the model over unseen data. To address the issue, previous approaches typically retrain the model, forcing it to fit the most recent observed data. However, retraining is problematic in that it leads to model lag, consumption of resources, and model re-invalidation, causing the drift problem to be not well solved in realistic scenarios. In this study, we propose a new urban time series prediction model for the concept drift problem, which encodes the drift by considering the periodicity in the data and makes on-the-fly adjustments to the model based on the drift using a meta-dynamic network. Experiments on real-world datasets show that our design significantly outperforms state-of-the-art methods and can be well generalized to existing prediction backbones by reducing their sensitivity to distribution changes. △ Less

Submitted 25 September, 2023; originally announced September 2023.

Comments: Accepted by CIKM 2023

arXiv:2307.10609 [pdf, other]

Hybrid Feature Embedding For Automatic Building Outline Extraction

Authors: Weihang Ran, Wei Yuan, Xiaodan Shi, Zipei Fan, Ryosuke Shibasaki

Abstract: Building outline extracted from high-resolution aerial images can be used in various application fields such as change detection and disaster assessment. However, traditional CNN model cannot recognize contours very precisely from original images. In this paper, we proposed a CNN and Transformer based model together with active contour model to deal with this problem. We also designed a triple-bra… ▽ More Building outline extracted from high-resolution aerial images can be used in various application fields such as change detection and disaster assessment. However, traditional CNN model cannot recognize contours very precisely from original images. In this paper, we proposed a CNN and Transformer based model together with active contour model to deal with this problem. We also designed a triple-branch decoder structure to handle different features generated by encoder. Experiment results show that our model outperforms other baseline model on two datasets, achieving 91.1% mIoU on Vaihingen and 83.8% on Bing huts. △ Less

Submitted 20 July, 2023; originally announced July 2023.

arXiv:2307.04101 [pdf, other]

Enhancing Building Semantic Segmentation Accuracy with Super Resolution and Deep Learning: Investigating the Impact of Spatial Resolution on Various Datasets

Authors: Zhiling Guo, Xiaodan Shi, Haoran Zhang, Dou Huang, Xiaoya Song, Jinyue Yan, Ryosuke Shibasaki

Abstract: The development of remote sensing and deep learning techniques has enabled building semantic segmentation with high accuracy and efficiency. Despite their success in different tasks, the discussions on the impact of spatial resolution on deep learning based building semantic segmentation are quite inadequate, which makes choosing a higher cost-effective data source a big challenge. To address the… ▽ More The development of remote sensing and deep learning techniques has enabled building semantic segmentation with high accuracy and efficiency. Despite their success in different tasks, the discussions on the impact of spatial resolution on deep learning based building semantic segmentation are quite inadequate, which makes choosing a higher cost-effective data source a big challenge. To address the issue mentioned above, in this study, we create remote sensing images among three study areas into multiple spatial resolutions by super-resolution and down-sampling. After that, two representative deep learning architectures: UNet and FPN, are selected for model training and testing. The experimental results obtained from three cities with two deep learning models indicate that the spatial resolution greatly influences building segmentation results, and with a better cost-effectiveness around 0.3m, which we believe will be an important insight for data selection and preparation. △ Less

Submitted 9 July, 2023; originally announced July 2023.

arXiv:2306.14857 [pdf, other]

doi 10.1007/978-3-031-26422-1_28

Metapopulation Graph Neural Networks: Deep Metapopulation Epidemic Modeling with Human Mobility

Authors: Qi Cao, Renhe Jiang, Chuang Yang, Zipei Fan, Xuan Song, Ryosuke Shibasaki

Abstract: Epidemic prediction is a fundamental task for epidemic control and prevention. Many mechanistic models and deep learning models are built for this task. However, most mechanistic models have difficulty estimating the time/region-varying epidemiological parameters, while most deep learning models lack the guidance of epidemiological domain knowledge and interpretability of prediction results. In th… ▽ More Epidemic prediction is a fundamental task for epidemic control and prevention. Many mechanistic models and deep learning models are built for this task. However, most mechanistic models have difficulty estimating the time/region-varying epidemiological parameters, while most deep learning models lack the guidance of epidemiological domain knowledge and interpretability of prediction results. In this study, we propose a novel hybrid model called MepoGNN for multi-step multi-region epidemic forecasting by incorporating Graph Neural Networks (GNNs) and graph learning mechanisms into Metapopulation SIR model. Our model can not only predict the number of confirmed cases but also explicitly learn the epidemiological parameters and the underlying epidemic propagation graph from heterogeneous data in an end-to-end manner. The multi-source epidemic-related data and mobility data of Japan are collected and processed to form the dataset for experiments. The experimental results demonstrate our model outperforms the existing mechanistic models and deep learning models by a large margin. Furthermore, the analysis on the learned parameters illustrate the high reliability and interpretability of our model and helps better understanding of epidemic spread. In addition, a mobility generation method is presented to address the issue of unavailable mobility data, and the experimental results demonstrate effectiveness of the generated mobility data as an input to our model. △ Less

Submitted 27 June, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: This is the extended version of an ECMLPKDD2022 paper

Journal ref: Machine Learning and Knowledge Discovery in Databases, Vol. 13718, pp. 453-468 (2023)

arXiv:2306.13875 [pdf, other]

Real-World Video for Zoom Enhancement based on Spatio-Temporal Coupling

Authors: Zhiling Guo, Yinqiang Zheng, Haoran Zhang, Xiaodan Shi, Zekun Cai, Ryosuke Shibasaki, Jinyue Yan

Abstract: In recent years, single-frame image super-resolution (SR) has become more realistic by considering the zooming effect and using real-world short- and long-focus image pairs. In this paper, we further investigate the feasibility of applying realistic multi-frame clips to enhance zoom quality via spatio-temporal information coupling. Specifically, we first built a real-world video benchmark, VideoRA… ▽ More In recent years, single-frame image super-resolution (SR) has become more realistic by considering the zooming effect and using real-world short- and long-focus image pairs. In this paper, we further investigate the feasibility of applying realistic multi-frame clips to enhance zoom quality via spatio-temporal information coupling. Specifically, we first built a real-world video benchmark, VideoRAW, by a synchronized co-axis optical system. The dataset contains paired short-focus raw and long-focus sRGB videos of different dynamic scenes. Based on VideoRAW, we then presented a Spatio-Temporal Coupling Loss, termed as STCL. The proposed STCL is intended for better utilization of information from paired and adjacent frames to align and fuse features both temporally and spatially at the feature level. The outperformed experimental results obtained in different zoom scenarios demonstrate the superiority of integrating real-world video dataset and STCL into existing SR models for zoom quality enhancement, and reveal that the proposed method can serve as an advanced and viable tool for video zoom. △ Less

Submitted 24 June, 2023; originally announced June 2023.

Comments: 11 pages

arXiv:2301.05336 [pdf, other]

Multitask Weakly Supervised Learning for Origin Destination Travel Time Estimation

Authors: Hongjun Wang, Zhiwen Zhang, Zipei Fan, Jiyuan Chen, Lingyu Zhang, Ryosuke Shibasaki, Xuan Song

Abstract: Travel time estimation from GPS trips is of great importance to order duration, ridesharing, taxi dispatching, etc. However, the dense trajectory is not always available due to the limitation of data privacy and acquisition, while the origin destination (OD) type of data, such as NYC taxi data, NYC bike data, and Capital Bikeshare data, is more accessible. To address this issue, this paper starts… ▽ More Travel time estimation from GPS trips is of great importance to order duration, ridesharing, taxi dispatching, etc. However, the dense trajectory is not always available due to the limitation of data privacy and acquisition, while the origin destination (OD) type of data, such as NYC taxi data, NYC bike data, and Capital Bikeshare data, is more accessible. To address this issue, this paper starts to estimate the OD trips travel time combined with the road network. Subsequently, a Multitask Weakly Supervised Learning Framework for Travel Time Estimation (MWSL TTE) has been proposed to infer transition probability between roads segments, and the travel time on road segments and intersection simultaneously. Technically, given an OD pair, the transition probability intends to recover the most possible route. And then, the output of travel time is equal to the summation of all segments' and intersections' travel time in this route. A novel route recovery function has been proposed to iteratively maximize the current route's co occurrence probability, and minimize the discrepancy between routes' probability distribution and the inverse distribution of routes' estimation loss. Moreover, the expected log likelihood function based on a weakly supervised framework has been deployed in optimizing the travel time from road segments and intersections concurrently. We conduct experiments on a wide range of real world taxi datasets in Xi'an and Chengdu and demonstrate our method's effectiveness on route recovery and travel time estimation. △ Less

Submitted 12 January, 2023; originally announced January 2023.

arXiv:2207.03575 [pdf, other]

Online Trajectory Prediction for Metropolitan Scale Mobility Digital Twin

Authors: Zipei Fan, Xiaojie Yang, Wei Yuan, Renhe Jiang, Quanjun Chen, Xuan Song, Ryosuke Shibasaki

Abstract: Knowing "what is happening" and "what will happen" of the mobility in a city is the building block of a data-driven smart city system. In recent years, mobility digital twin that makes a virtual replication of human mobility and predicting or simulating the fine-grained movements of the subjects in a virtual space at a metropolitan scale in near real-time has shown its great potential in modern ur… ▽ More Knowing "what is happening" and "what will happen" of the mobility in a city is the building block of a data-driven smart city system. In recent years, mobility digital twin that makes a virtual replication of human mobility and predicting or simulating the fine-grained movements of the subjects in a virtual space at a metropolitan scale in near real-time has shown its great potential in modern urban intelligent systems. However, few studies have provided practical solutions. The main difficulties are four-folds. 1) The daily variation of human mobility is hard to model and predict; 2) the transportation network enforces a complex constraints on human mobility; 3) generating a rational fine-grained human trajectory is challenging for existing machine learning models; and 4) making a fine-grained prediction incurs high computational costs, which is challenging for an online system. Bearing these difficulties in mind, in this paper we propose a two-stage human mobility predictor that stratifies the coarse and fine-grained level predictions. In the first stage, to encode the daily variation of human mobility at a metropolitan level, we automatically extract citywide mobility trends as crowd contexts and predict long-term and long-distance movements at a coarse level. In the second stage, the coarse predictions are resolved to a fine-grained level via a probabilistic trajectory retrieval method, which offloads most of the heavy computations to the offline phase. We tested our method using a real-world mobile phone GPS dataset in the Kanto area in Japan, and achieved good prediction accuracy and a time efficiency of about 2 min in predicting future 1h movements of about 220K mobile phone users on a single machine to support more higher-level analysis of mobility prediction. △ Less

Submitted 21 June, 2022; originally announced July 2022.

arXiv:2207.00838 [pdf, other]

GOF-TTE: Generative Online Federated Learning Framework for Travel Time Estimation

Authors: Zhiwen Zhang, Hongjun Wang, Jiyuan Chen, Zipei Fan, Xuan Song, Ryosuke Shibasaki

Abstract: Estimating the travel time of a path is an essential topic for intelligent transportation systems. It serves as the foundation for real-world applications, such as traffic monitoring, route planning, and taxi dispatching. However, building a model for such a data-driven task requires a large amount of users' travel information, which directly relates to their privacy and thus is less likely to be… ▽ More Estimating the travel time of a path is an essential topic for intelligent transportation systems. It serves as the foundation for real-world applications, such as traffic monitoring, route planning, and taxi dispatching. However, building a model for such a data-driven task requires a large amount of users' travel information, which directly relates to their privacy and thus is less likely to be shared. The non-Independent and Identically Distributed (non-IID) trajectory data across data owners also make a predictive model extremely challenging to be personalized if we directly apply federated learning. Finally, previous work on travel time estimation does not consider the real-time traffic state of roads, which we argue can significantly influence the prediction. To address the above challenges, we introduce GOF-TTE for the mobile user group, Generative Online Federated Learning Framework for Travel Time Estimation, which I) utilizes the federated learning approach, allowing private data to be kept on client devices while training, and designs the global model as an online generative model shared by all clients to infer the real-time road traffic state. II) apart from sharing a base model at the server, adapts a fine-tuned personalized model for every client to study their personal driving habits, making up for the residual error made by localized global model prediction. % III) designs the global model as an online generative model shared by all clients to infer the real-time road traffic state. We also employ a simple privacy attack to our framework and implement the differential privacy mechanism to further guarantee privacy safety. Finally, we conduct experiments on two real-world public taxi datasets of DiDi Chengdu and Xi'an. The experimental results demonstrate the effectiveness of our proposed framework. △ Less

Submitted 2 July, 2022; originally announced July 2022.

Comments: 16 pages

arXiv:2206.10418 [pdf, other]

Route to Time and Time to Route: Travel Time Estimation from Sparse Trajectories

Authors: Zhiwen Zhang, Hongjun Wang, Zipei Fan, Jiyuan Chen, Xuan Song, Ryosuke Shibasaki

Abstract: Due to the rapid development of Internet of Things (IoT) technologies, many online web apps (e.g., Google Map and Uber) estimate the travel time of trajectory data collected by mobile devices. However, in reality, complex factors, such as network communication and energy constraints, make multiple trajectories collected at a low sampling rate. In this case, this paper aims to resolve the problem o… ▽ More Due to the rapid development of Internet of Things (IoT) technologies, many online web apps (e.g., Google Map and Uber) estimate the travel time of trajectory data collected by mobile devices. However, in reality, complex factors, such as network communication and energy constraints, make multiple trajectories collected at a low sampling rate. In this case, this paper aims to resolve the problem of travel time estimation (TTE) and route recovery in sparse scenarios, which often leads to the uncertain label of travel time and route between continuously sampled GPS points. We formulate this problem as an inexact supervision problem in which the training data has coarsely grained labels and jointly solve the tasks of TTE and route recovery. And we argue that both two tasks are complementary to each other in the model-learning procedure and hold such a relation: more precise travel time can lead to better inference for routes, in turn, resulting in a more accurate time estimation). Based on this assumption, we propose an EM algorithm to alternatively estimate the travel time of inferred route through weak supervision in E step and retrieve the route based on estimated travel time in M step for sparse trajectories. We conducted experiments on three real-world trajectory datasets and demonstrated the effectiveness of the proposed method. △ Less

Submitted 21 June, 2022; originally announced June 2022.

arXiv:2204.05184 [pdf, other]

doi 10.1109/JIOT.2023.3262740

Domain Adversarial Graph Convolutional Network Based on RSSI and Crowdsensing for Indoor Localization

Authors: Mingxin Zhang, Zipei Fan, Ryosuke Shibasaki, Xuan Song

Abstract: In recent years, the use of WiFi fingerprints for indoor positioning has grown in popularity, largely due to the widespread availability of WiFi and the proliferation of mobile communication devices. However, many existing methods for constructing fingerprint datasets rely on labor-intensive and time-consuming processes of collecting large amounts of data. Additionally, these methods often focus o… ▽ More In recent years, the use of WiFi fingerprints for indoor positioning has grown in popularity, largely due to the widespread availability of WiFi and the proliferation of mobile communication devices. However, many existing methods for constructing fingerprint datasets rely on labor-intensive and time-consuming processes of collecting large amounts of data. Additionally, these methods often focus on ideal laboratory environments, rather than considering the practical challenges of large multi-floor buildings. To address these issues, we present a novel WiDAGCN model that can be trained using a small number of labeled site survey data and large amounts of unlabeled crowdsensed WiFi fingerprints. By constructing heterogeneous graphs based on received signal strength indicators (RSSIs) between waypoints and WiFi access points (APs), our model is able to effectively capture the topological structure of the data. We also incorporate graph convolutional networks (GCNs) to extract graph-level embeddings, a feature that has been largely overlooked in previous WiFi indoor localization studies. To deal with the challenges of large amounts of unlabeled data and multiple data domains, we employ a semi-supervised domain adversarial training scheme to effectively utilize unlabeled data and align the data distributions across domains. Our system is evaluated using a public indoor localization dataset that includes multiple buildings, and the results show that it performs competitively in terms of localization accuracy in large buildings. △ Less

Submitted 31 March, 2023; v1 submitted 6 April, 2022; originally announced April 2022.

Comments: IEEE Internet of Things Journal

Journal ref: IEEE Internet of Things Journal, vol. 10, no. 15, pp. 13662-13672, 2023

arXiv:2112.08443 [pdf, other]

Event-Aware Multimodal Mobility Nowcasting

Authors: Zhaonan Wang, Renhe Jiang, Hao Xue, Flora D. Salim, Xuan Song, Ryosuke Shibasaki

Abstract: As a decisive part in the success of Mobility-as-a-Service (MaaS), spatio-temporal predictive modeling for crowd movements is a challenging task particularly considering scenarios where societal events drive mobility behavior deviated from the normality. While tremendous progress has been made to model high-level spatio-temporal regularities with deep learning, most, if not all of the existing met… ▽ More As a decisive part in the success of Mobility-as-a-Service (MaaS), spatio-temporal predictive modeling for crowd movements is a challenging task particularly considering scenarios where societal events drive mobility behavior deviated from the normality. While tremendous progress has been made to model high-level spatio-temporal regularities with deep learning, most, if not all of the existing methods are neither aware of the dynamic interactions among multiple transport modes nor adaptive to unprecedented volatility brought by potential societal events. In this paper, we are therefore motivated to improve the canonical spatio-temporal network (ST-Net) from two perspectives: (1) design a heterogeneous mobility information network (HMIN) to explicitly represent intermodality in multimodal mobility; (2) propose a memory-augmented dynamic filter generator (MDFG) to generate sequence-specific parameters in an on-the-fly fashion for various scenarios. The enhanced event-aware spatio-temporal network, namely EAST-Net, is evaluated on several real-world datasets with a wide variety and coverage of societal events. Both quantitative and qualitative experimental results verify the superiority of our approach compared with the state-of-the-art baselines. Code and data are published on https://github.com/underdoc-wang/EAST-Net. △ Less

Submitted 14 December, 2021; originally announced December 2021.

Comments: Accepted by AAAI 2022

arXiv:2111.10785 [pdf, other]

Differentiable Projection for Constrained Deep Learning

Authors: Dou Huang, Haoran Zhang, Xuan Song, Ryosuke Shibasaki

Abstract: Deep neural networks (DNNs) have achieved extraordinary performance in solving different tasks in various fields. However, the conventional DNN model is steadily approaching the ground-truth value through loss backpropagation. In some applications, some prior knowledge could be easily obtained, such as constraints which the ground truth observation follows. Here, we try to give a general approach… ▽ More Deep neural networks (DNNs) have achieved extraordinary performance in solving different tasks in various fields. However, the conventional DNN model is steadily approaching the ground-truth value through loss backpropagation. In some applications, some prior knowledge could be easily obtained, such as constraints which the ground truth observation follows. Here, we try to give a general approach to incorporate information from these constraints to enhance the performance of the DNNs. Theoretically, we could formulate these kinds of problems as constrained optimization problems that KKT conditions could solve. In this paper, we propose to use a differentiable projection layer in DNN instead of directly solving time-consuming KKT conditions. The proposed projection method is differentiable, and no heavy computation is required. Finally, we also conducted some experiments using a randomly generated synthetic dataset and image segmentation task using the PASCAL VOC dataset to evaluate the performance of the proposed projection method. Experimental results show that the projection method is sufficient and outperforms baseline methods. △ Less

Submitted 21 November, 2021; originally announced November 2021.

arXiv:2109.08527 [pdf]

An open GPS trajectory dataset and benchmark for travel mode detection

Authors: Jinyu Chen, Haoran Zhang, Xuan Song, Ryosuke Shibasaki

Abstract: Travel mode detection has been a hot topic in the field of GPS trajectory-related processing. Former scholars have developed many mathematical methods to improve the accuracy of detection. Among these studies, almost all of the methods require ground truth dataset for training. A large amount of the studies choose to collect the GPS trajectory dataset for training by their customized ways. Current… ▽ More Travel mode detection has been a hot topic in the field of GPS trajectory-related processing. Former scholars have developed many mathematical methods to improve the accuracy of detection. Among these studies, almost all of the methods require ground truth dataset for training. A large amount of the studies choose to collect the GPS trajectory dataset for training by their customized ways. Currently, there is no open GPS dataset marked with travel mode. If there exists one, it will not only save a lot of efforts in model developing, but also help compare the performance of models. In this study, we propose and open GPS trajectory dataset marked with travel mode and benchmark for the travel mode detection. The dataset is collected by 7 independent volunteers in Japan and covers the time period of a complete month. The travel mode ranges from walking to railway. A part of routines are traveled repeatedly in different time slots to experience different road and travel conditions. We also provide a case study to distinguish the walking and bike trips in a massive GPS trajectory dataset. △ Less

Submitted 28 September, 2021; v1 submitted 17 September, 2021; originally announced September 2021.

arXiv:2108.09091 [pdf, other]

doi 10.1145/3459637.3482000

DL-Traff: Survey and Benchmark of Deep Learning Models for Urban Traffic Prediction

Authors: Renhe Jiang, Du Yin, Zhaonan Wang, Yizhuo Wang, Jiewen Deng, Hangchen Liu, Zekun Cai, Jinliang Deng, Xuan Song, Ryosuke Shibasaki

Abstract: Nowadays, with the rapid development of IoT (Internet of Things) and CPS (Cyber-Physical Systems) technologies, big spatiotemporal data are being generated from mobile phones, car navigation systems, and traffic sensors. By leveraging state-of-the-art deep learning technologies on such data, urban traffic prediction has drawn a lot of attention in AI and Intelligent Transportation System community… ▽ More Nowadays, with the rapid development of IoT (Internet of Things) and CPS (Cyber-Physical Systems) technologies, big spatiotemporal data are being generated from mobile phones, car navigation systems, and traffic sensors. By leveraging state-of-the-art deep learning technologies on such data, urban traffic prediction has drawn a lot of attention in AI and Intelligent Transportation System community. The problem can be uniformly modeled with a 3D tensor (T, N, C), where T denotes the total time steps, N denotes the size of the spatial domain (i.e., mesh-grids or graph-nodes), and C denotes the channels of information. According to the specific modeling strategy, the state-of-the-art deep learning models can be divided into three categories: grid-based, graph-based, and multivariate time-series models. In this study, we first synthetically review the deep traffic models as well as the widely used datasets, then build a standard benchmark to comprehensively evaluate their performances with the same settings and metrics. Our study named DL-Traff is implemented with two most popular deep learning frameworks, i.e., TensorFlow and PyTorch, which is already publicly available as two GitHub repositories https://github.com/deepkashiwa20/DL-Traff-Grid and https://github.com/deepkashiwa20/DL-Traff-Graph. With DL-Traff, we hope to deliver a useful resource to researchers who are interested in spatiotemporal data analysis. △ Less

Submitted 20 August, 2021; originally announced August 2021.

Comments: This paper has been accepted by CIKM 2021 Resource Track

arXiv:2107.02411 [pdf]

Adapting Vehicle Detector to Target Domain by Adversarial Prediction Alignment

Authors: Yohei Koga, Hiroyuki Miyazaki, Ryosuke Shibasaki

Abstract: While recent advancement of domain adaptation techniques is significant, most of methods only align a feature extractor and do not adapt a classifier to target domain, which would be a cause of performance degradation. We propose novel domain adaptation technique for object detection that aligns prediction output space. In addition to feature alignment, we aligned predictions of locations and clas… ▽ More While recent advancement of domain adaptation techniques is significant, most of methods only align a feature extractor and do not adapt a classifier to target domain, which would be a cause of performance degradation. We propose novel domain adaptation technique for object detection that aligns prediction output space. In addition to feature alignment, we aligned predictions of locations and class confidences of our vehicle detector for satellite images by adversarial training. The proposed method significantly improved AP score by over 5%, which shows effectivity of our method for object detection tasks in satellite images. △ Less

Submitted 6 July, 2021; originally announced July 2021.

Comments: The accepted version of the article in IEEE International Geoscience and Remote Sensing Symposium (IGARSS) 2021. Copyright 2021 IEEE. Code available: https://github.com/monotaro3/vd_pred_align

arXiv:2104.13139 [pdf]

Mobsimilarity: Vector Graph Optimization for Mobility Tableau Comparison

Authors: Yuhao Yao, Haoran Zhang, Jinyu Chen, Wenjing Li, Mariko Shibasaki, Ryosuke Shibasaki, Xuan Song

Abstract: Human mobility similarity comparison plays a critical role in mobility estimation/prediction model evaluation, mobility clustering and mobility matching, which exerts an enormous impact on improving urban mobility, accessibility, and reliability. By expanding origin-destination matrix, we propose a concept named mobility tableau, which is an aggregated tableau representation to the population flow… ▽ More Human mobility similarity comparison plays a critical role in mobility estimation/prediction model evaluation, mobility clustering and mobility matching, which exerts an enormous impact on improving urban mobility, accessibility, and reliability. By expanding origin-destination matrix, we propose a concept named mobility tableau, which is an aggregated tableau representation to the population flow distributed between different location pairs of a study site and can be represented by a vector graph. Compared with traditional OD matrix-based mobility comparison, mobility tableau comparison provides high-dimensional similarity information, including volume similarity, spatial similarity, mass inclusiveness and structure similarity. A novel mobility tableaus similarity measurement method is proposed by optimizing the least spatial cost of transforming the vector graph for one mobility tableau into the other and is optimized to be efficient. The robustness of the measure is supported through several sensitive analysis on GPS based mobility tableau. The better performance of the approach compared with traditional mobility comparison methods in two case studies demonstrate the practicality and superiority, while one study is estimated mobility tableaus validation and the other is different cities' mobility tableaus comparison. △ Less

Submitted 29 September, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

arXiv:2104.11968 [pdf]

Effective Metagraph-based Life Pattern Clustering with Big Human Mobility Data

Authors: Wenjing Li, Haoran Zhang, Jinyu Chen, Peiran Li, Yuhao Yao, Mariko Shibasaki, Xuan Song, Ryosuke Shibasaki

Abstract: Life pattern clustering is essential for abstracting the groups' characteristics of daily mobility patterns and activity regularity. Based on millions of GPS records, this paper proposed a framework on the life pattern clustering which can efficiently identify the groups have similar life pattern. The proposed method can retain original features of individual life pattern data without aggregation.… ▽ More Life pattern clustering is essential for abstracting the groups' characteristics of daily mobility patterns and activity regularity. Based on millions of GPS records, this paper proposed a framework on the life pattern clustering which can efficiently identify the groups have similar life pattern. The proposed method can retain original features of individual life pattern data without aggregation. Metagraph-based data structure is proposed for presenting the diverse life pattern. Spatial-temporal similarity includes significant places semantics, time sequential properties and frequency are integrated into this data structure, which captures the uncertainty of an individual and the diversities between individuals. Non-negative-factorization-based method was utilized for reducing the dimension. The results show that our proposed method can effectively identify the groups have similar life pattern and takes advantages in computation efficiency and robustness comparing with the traditional method. We revealed the representative life pattern groups and analyzed the group characteristics of human life patterns during different periods and different regions. We believe our work will help in future infrastructure planning, services improvement and policies making related to urban and transportation, thus promoting a humanized and sustainable city. △ Less

Submitted 24 April, 2021; originally announced April 2021.

arXiv:2011.12847 [pdf, other]

Deep-learning coupled with novel classification method to classify the urban environment of the developing world

Authors: Qianwei Cheng, AKM Mahbubur Rahman, Anis Sarker, Abu Bakar Siddik Nayem, Ovi Paul, Amin Ahsan Ali, M Ashraful Amin, Ryosuke Shibasaki, Moinul Zaber

Abstract: Rapid globalization and the interdependence of humanity that engender tremendous in-flow of human migration towards the urban spaces. With advent of high definition satellite images, high resolution data, computational methods such as deep neural network, capable hardware; urban planning is seeing a paradigm shift. Legacy data on urban environments are now being complemented with high-volume, high… ▽ More Rapid globalization and the interdependence of humanity that engender tremendous in-flow of human migration towards the urban spaces. With advent of high definition satellite images, high resolution data, computational methods such as deep neural network, capable hardware; urban planning is seeing a paradigm shift. Legacy data on urban environments are now being complemented with high-volume, high-frequency data. In this paper we propose a novel classification method that is readily usable for machine analysis and show applicability of the methodology on a developing world setting. The state-of-the-art is mostly dominated by classification of building structures, building types etc. and largely represents the developed world which are insufficient for developing countries such as Bangladesh where the surrounding is crucial for the classification. Moreover, the traditional methods propose small-scale classifications, which give limited information with poor scalability and are slow to compute. We categorize the urban area in terms of informal and formal spaces taking the surroundings into account. 50 km x 50 km Google Earth image of Dhaka, Bangladesh was visually annotated and categorized by an expert. The classification is based broadly on two dimensions: urbanization and the architectural form of urban environment. Consequently, the urban space is divided into four classes: 1) highly informal; 2) moderately informal; 3) moderately formal; and 4) highly formal areas. In total 16 sub-classes were identified. For semantic segmentation, Google's DeeplabV3+ model was used which increases the field of view of the filters to incorporate larger context. Image encompassing 70% of the urban space was used for training and the remaining 30% was used for testing and validation. The model is able to segment with 75% accuracy and 60% Mean IoU. △ Less

Submitted 7 January, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

Comments: Accepted paper at 2nd International Conference on Signal Processing and Machine Learning (SIGML 2021); 20 pages, 7 figures, 1 table

arXiv:2007.03180 [pdf, other]

doi 10.1109/TVCG.2022.3165385

EpiMob: Interactive Visual Analytics of Citywide Human Mobility Restrictions for Epidemic Control

Authors: Chuang Yang, Zhiwen Zhang, Zipei Fan, Renhe Jiang, Quanjun Chen, Xuan Song, Ryosuke Shibasaki

Abstract: The outbreak of coronavirus disease (COVID-19) has swept across more than 180 countries and territories since late January 2020. As a worldwide emergency response, governments have implemented various measures and policies, such as self-quarantine, travel restrictions, work from home, and regional lockdown, to control the spread of the epidemic. These countermeasures seek to restrict human mobilit… ▽ More The outbreak of coronavirus disease (COVID-19) has swept across more than 180 countries and territories since late January 2020. As a worldwide emergency response, governments have implemented various measures and policies, such as self-quarantine, travel restrictions, work from home, and regional lockdown, to control the spread of the epidemic. These countermeasures seek to restrict human mobility because COVID-19 is a highly contagious disease that is spread by human-to-human transmission. Medical experts and policymakers have expressed the urgency to effectively evaluate the outcome of human restriction policies with the aid of big data and information technology. Thus, based on big human mobility data and city POI data, an interactive visual analytics system called Epidemic Mobility (EpiMob) was designed in this study. The system interactively simulates the changes in human mobility and infection status in response to the implementation of a certain restriction policy or a combination of policies (e.g., regional lockdown, telecommuting, screening). Users can conveniently designate the spatial and temporal ranges for different mobility restriction policies. Then, the results reflecting the infection situation under different policies are dynamically displayed and can be flexibly compared and analyzed in depth. Multiple case studies consisting of interviews with domain experts were conducted in the largest metropolitan area of Japan (i.e., Greater Tokyo Area) to demonstrate that the system can provide insight into the effects of different human mobility restriction policies for epidemic control, through measurements and comparisons. △ Less

Submitted 29 November, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

Journal ref: IEEE Transactions on Visualization and Computer Graphics, 2022

arXiv:1911.06982 [pdf, other]

VLUC: An Empirical Benchmark for Video-Like Urban Computing on Citywide Crowd and Traffic Prediction

Authors: Renhe Jiang, Zekun Cai, Zhaonan Wang, Chuang Yang, Zipei Fan, Xuan Song, Kota Tsubouchi, Ryosuke Shibasaki

Abstract: Nowadays, massive urban human mobility data are being generated from mobile phones, car navigation systems, and traffic sensors. Predicting the density and flow of the crowd or traffic at a citywide level becomes possible by using the big data and cutting-edge AI technologies. It has been a very significant research topic with high social impact, which can be widely applied to emergency management… ▽ More Nowadays, massive urban human mobility data are being generated from mobile phones, car navigation systems, and traffic sensors. Predicting the density and flow of the crowd or traffic at a citywide level becomes possible by using the big data and cutting-edge AI technologies. It has been a very significant research topic with high social impact, which can be widely applied to emergency management, traffic regulation, and urban planning. In particular, by meshing a large urban area to a number of fine-grained mesh-grids, citywide crowd and traffic information in a continuous time period can be represented like a video, where each timestamp can be seen as one video frame. Based on this idea, a series of methods have been proposed to address video-like prediction for citywide crowd and traffic. In this study, we publish a new aggregated human mobility dataset generated from a real-world smartphone application and build a standard benchmark for such kind of video-like urban computing with this new dataset and the existing open datasets. We first comprehensively review the state-of-the-art works of literature and formulate the density and in-out flow prediction problem, then conduct a thorough performance assessment for those methods. With this benchmark, we hope researchers can easily follow up and quickly launch a new solution on this topic. △ Less

Submitted 16 November, 2019; originally announced November 2019.

arXiv:1809.10862 [pdf]

Semantic Segmentation for Urban Planning Maps based on U-Net

Authors: Zhiling Guo, Hiroaki Shengoku, Guangming Wu, Qi Chen, Wei Yuan, Xiaodan Shi, Xiaowei Shao, Yongwei Xu, Ryosuke Shibasaki

Abstract: The automatic digitizing of paper maps is a significant and challenging task for both academia and industry. As an important procedure of map digitizing, the semantic segmentation section mainly relies on manual visual interpretation with low efficiency. In this study, we select urban planning maps as a representative sample and investigate the feasibility of utilizing U-shape fully convolutional… ▽ More The automatic digitizing of paper maps is a significant and challenging task for both academia and industry. As an important procedure of map digitizing, the semantic segmentation section mainly relies on manual visual interpretation with low efficiency. In this study, we select urban planning maps as a representative sample and investigate the feasibility of utilizing U-shape fully convolutional based architecture to perform end-to-end map semantic segmentation. The experimental results obtained from the test area in Shibuya district, Tokyo, demonstrate that our proposed method could achieve a very high Jaccard similarity coefficient of 93.63% and an overall accuracy of 99.36%. For implementation on GPGPU and cuDNN, the required processing time for the whole Shibuya district can be less than three minutes. The results indicate the proposed method can serve as a viable tool for urban planning map semantic segmentation task with high accuracy and efficiency. △ Less

Submitted 30 September, 2018; v1 submitted 28 September, 2018; originally announced September 2018.

Comments: 4 pages, 3 figures, conference, International Geoscience and Remote Sensing Symposium (IGARSS 2018), Jul 2018, Valencia, Spain

arXiv:1708.03921 [pdf, other]

Visual Graph Mining

Authors: Quanshi Zhang, Xuan Song, Ryosuke Shibasaki

Abstract: In this study, we formulate the concept of "mining maximal-size frequent subgraphs" in the challenging domain of visual data (images and videos). In general, visual knowledge can usually be modeled as attributed relational graphs (ARGs) with local attributes representing local parts and pairwise attributes describing the spatial relationship between parts. Thus, from a practical perspective, such… ▽ More In this study, we formulate the concept of "mining maximal-size frequent subgraphs" in the challenging domain of visual data (images and videos). In general, visual knowledge can usually be modeled as attributed relational graphs (ARGs) with local attributes representing local parts and pairwise attributes describing the spatial relationship between parts. Thus, from a practical perspective, such mining of maximal-size subgraphs can be regarded as a general platform for discovering and modeling the common objects within cluttered and unlabeled visual data. Then, from a theoretical perspective, visual graph mining should encode and overcome the great fuzziness of messy data collected from complex real-world situations, which conflicts with the conventional theoretical basis of graph mining designed for tabular data. Common subgraphs hidden in these ARGs usually have soft attributes, with considerable inter-graph variation. More importantly, we should also discover the latent pattern space, including similarity metrics for the pattern and hidden node relations, during the mining process. In this study, we redefine the visual subgraph pattern that encodes all of these challenges in a general way, and propose an approximate but efficient solution to graph mining. We conduct five experiments to evaluate our method with different kinds of visual data, including videos and RGB/RGB-D images. These experiments demonstrate the generality of the proposed method. △ Less

Submitted 13 August, 2017; originally announced August 2017.

Showing 1–23 of 23 results for author: Shibasaki, R