
Geosocial Co-Clustering: A Novel Framework for Geosocial Community Detection

Published: 13 June 2020

Abstract

Location-based services on mobile devices are now popular worldwide, so social network analysis, and community detection in particular, increasingly benefits from combining social relationships with geographic preferences. This article addresses the emerging problem of geosocial community detection. We first formalize the problem of geosocial co-clustering, which co-clusters the users in a social network and the locations they visited. Geosocial co-clustering detects higher-quality communities than existing approaches by improving the mapping clusterability, whereby users in the same community tend to visit locations in the same region. Although geosocial co-clustering is soundly formalized as non-negative matrix tri-factorization, conventional matrix tri-factorization algorithms incur significant computational overhead on large-scale datasets. We therefore also develop an efficient framework for geosocial co-clustering, called GEOsocial COarsening and DEcomposition (GEOCODE). To make matrix tri-factorization efficient, GEOCODE reduces the numbers of users and locations through coarsening and then decomposes the single whole matrix tri-factorization into a set of smaller sub-matrix tri-factorizations. Thorough experiments on real-world geosocial networks show that GEOCODE reduces the elapsed time by a factor of 19–69 while achieving an accuracy of up to 94.8% relative to the state-of-the-art co-clustering algorithm. Furthermore, the benefit of mapping clusterability is clearly demonstrated through a local expert recommendation application.
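
To make the formalization concrete, the following is a minimal sketch of generic non-negative matrix tri-factorization, which approximates a user-by-location matrix X as F S Gᵀ. It is not GEOCODE and not the article's algorithm: it uses plain multiplicative updates for the Frobenius-norm objective, with no coarsening, decomposition, or social and spatial regularization. The function name `nmtf`, its parameters, and the toy check-in matrix are all assumptions made for illustration.

```python
import numpy as np


def nmtf(X, k_rows, k_cols, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X (n x m) as F @ S @ G.T.

    F (n x k_rows) holds soft row-cluster memberships, G (m x k_cols) holds
    soft column-cluster memberships, and S (k_rows x k_cols) links the two.
    Hypothetical helper: generic multiplicative updates, not GEOCODE.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k_rows)) + eps
    S = rng.random((k_rows, k_cols)) + eps
    G = rng.random((m, k_cols)) + eps

    # errors[0] is the reconstruction error of the random initialization.
    errors = [np.linalg.norm(X - F @ S @ G.T)]
    for _ in range(n_iter):
        # Each factor is updated with the other two held fixed; eps guards
        # against division by zero.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T))
    return F, S, G, errors


# Toy check-in counts (assumed data): 6 users x 5 locations, two rough blocks.
X = np.array([
    [5, 4, 0, 0, 0],
    [4, 5, 1, 0, 0],
    [5, 3, 0, 0, 1],
    [0, 0, 4, 5, 4],
    [0, 1, 5, 4, 5],
    [0, 0, 4, 4, 5],
], dtype=float)

F, S, G, errors = nmtf(X, k_rows=2, k_cols=2)

# A common heuristic: assign each user/location to its largest membership.
user_cluster = F.argmax(axis=1)
location_cluster = G.argmax(axis=1)
```

Reading cluster labels off the largest entry in each row of F and G is only a heuristic, because the factors are defined up to scaling; the result also depends on the random initialization.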

    Published In

    ACM Transactions on Intelligent Systems and Technology, Volume 11, Issue 4
    Survey Paper and Regular Paper
    August 2020
    358 pages
    ISSN:2157-6904
    EISSN:2157-6912
    DOI:10.1145/3401889

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 June 2020
    Online AM: 07 May 2020
    Accepted: 01 March 2020
    Received: 01 June 2019
    Published in TIST Volume 11, Issue 4

    Author Tags

    1. Geosocial networks
    2. co-clustering
    3. mapping clusterability
    4. non-negative matrix factorization
    5. social similarity
    6. spatial similarity

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National Research Foundation of Korea (NRF)
