Abstract
Pre-training location embeddings from human mobility data has become a popular approach for location-based services. In practice, however, training location embeddings is prohibitively expensive when the spatial resolution is fine-grained or the target region is extensive, owing to the large number of distinct locations involved. Previous studies have handled fewer than ten thousand distinct locations, which is insufficient for real-world applications. To tackle this problem, we propose a Geo-Tokenizer, designed to efficiently reduce the number of locations to be trained by representing a location as a combination of several grids at different scales. In the Geo-Tokenizer, a grid at a larger scale shares a common set of grids at smaller scales, which is the key factor in reducing the size of the location vocabulary. The sequences of locations preprocessed by the Geo-Tokenizer are fed to a causal location embedding model that captures the temporal dependencies among locations. This model dynamically computes the embedding vector of a target location, which varies depending on its trajectory. In addition, to efficiently pre-train the location embedding model, we propose the Hierarchical Auto-regressive Location Model objective, which effectively trains the locations decomposed by the Geo-Tokenizer. We conducted experiments on two real-world user trajectory datasets using our pre-trained location model. The experimental results show that our model significantly improves the performance of downstream tasks with fewer model parameters than existing location embedding methods.
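The core idea of the Geo-Tokenizer, as described above, is to replace one ID per fine-grained location with a short sequence of coarse-to-fine grid cells, where every level reuses the same small set of relative cell indices. The paper's own code is not reproduced here; the following is a minimal illustrative sketch under that assumption, with all names and parameters (`geo_tokenize`, `levels`, `splits`) chosen for exposition rather than taken from the paper.

```python
# Hypothetical sketch of hierarchical grid tokenization in the spirit of the
# Geo-Tokenizer: a (lat, lon) point becomes `levels` tokens, one per scale,
# and each token is an index into a shared per-level vocabulary of size
# splits * splits, instead of one ID per fine-grained cell.
def geo_tokenize(lat, lon, levels=3, splits=4):
    """Map a (lat, lon) point to `levels` tokens, each in [0, splits * splits)."""
    # Normalize coordinates into the unit square.
    x = (lon + 180.0) / 360.0
    y = (lat + 90.0) / 180.0
    tokens = []
    for _ in range(levels):
        # Row-major index of the current cell within its parent cell.
        col = min(int(x * splits), splits - 1)
        row = min(int(y * splits), splits - 1)
        tokens.append(row * splits + col)
        # Zoom into that cell for the next, finer level.
        x = x * splits - col
        y = y * splits - row
    return tokens

# With 3 levels of 4x4 splits, 4^6 = 4096 distinct finest-level cells are
# addressed using a shared vocabulary of only 16 token IDs per level.
print(geo_tokenize(37.5665, 126.9780))  # three tokens, each in 0..15
```

The vocabulary reduction is the point of the decomposition: the number of trainable token embeddings grows with `levels * splits**2` rather than with the total number of finest-level cells, which is what makes fine-grained resolutions and extensive regions tractable.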
Acknowledgment
This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST)) and the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2022R1A2B5B02001913). The authors would like to thank the AI Service Business Division of SK Telecom for providing GPU cluster support to conduct massive experiments.
Ethical Statement
There are no ethical issues.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Park, C., Kim, T., Hong, J., Choi, M., Choo, J. (2023). Pre-training Contextual Location Embeddings in Personal Trajectories via Efficient Hierarchical Location Representations. In: De Francisci Morales, G., Perlich, C., Ruchansky, N., Kourtellis, N., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track. ECML PKDD 2023. Lecture Notes in Computer Science(), vol 14175. Springer, Cham. https://doi.org/10.1007/978-3-031-43430-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43429-7
Online ISBN: 978-3-031-43430-3
eBook Packages: Computer Science, Computer Science (R0)