Personalized emerging topic detection based on a term aging model

Published: 03 January 2014

Abstract

Twitter is a popular microblogging service that acts as a ground-level news-flash portal, where people of different backgrounds, ages, and social conditions report what is happening in front of their eyes. This characteristic makes Twitter probably the fastest information service in the world. In this article, we recognize this role of Twitter and propose a novel, user-aware topic detection technique that retrieves, in real time, the most emerging topics of discussion expressed by the community within the interests of specific users. First, we analyze the topology of Twitter, looking at how information spreads over the network and taking into account the authority/influence of each active user. Then, we use a novel term aging model to compute the burstiness of each term and provide a graph-based method to retrieve the minimal set of terms that can represent the corresponding topic. Finally, since any user can have topic preferences inferable from the content she shares, we leverage this knowledge to highlight the most emerging topics within her foci of interest. For evaluation, we provide several experiments together with a user study demonstrating the validity and reliability of the proposed approach.
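
Two ideas in the abstract lend themselves to a small illustration: scoring terms by the authority of the users who post them, and measuring burstiness against a term's own past. The Python sketch below is not the article's model. The abstract gives no formulas, so the scoring rule (sum of author authority per term), the 1/age discount on older time slots, the function names `term_score` and `burstiness`, and the toy data are all assumptions made for this example.

```python
from collections import defaultdict


def term_score(tweets, authority):
    """Score each term in one time slot by summing the authority of its authors.

    tweets: list of (author, text) pairs. authority: author -> weight.
    Both the weighting rule and the whitespace tokenizer are assumptions.
    """
    scores = defaultdict(float)
    for author, text in tweets:
        weight = authority.get(author, 0.0)
        # Count a term once per tweet, however often it repeats in the text.
        for term in set(text.lower().split()):
            scores[term] += weight
    return dict(scores)


def burstiness(history, term):
    """Compare a term's latest score with its earlier scores.

    history: list of per-slot score dicts, oldest first.
    Differences against older slots are discounted by 1/age (an assumed
    aging rule), so recent growth counts more than growth long ago.
    """
    current = history[-1].get(term, 0.0)
    total = 0.0
    for age, past in enumerate(reversed(history[:-1]), start=1):
        total += (current - past.get(term, 0.0)) / age
    return total


# Toy data: three consecutive time slots and hypothetical authority values.
slots = [
    [("alice", "rain in turin"), ("bob", "coffee time")],
    [("alice", "coffee again"), ("bob", "rain again")],
    [("alice", "earthquake in turin"), ("carol", "earthquake felt here")],
]
authority = {"alice": 0.5, "bob": 0.2, "carol": 0.3}

history = [term_score(slot, authority) for slot in slots]
ranked = sorted(history[-1], key=lambda t: burstiness(history, t), reverse=True)
print(ranked[0])
```

In this toy run, "earthquake" ranks first because it is absent from both earlier slots and is posted by two users with non-trivial authority, whereas "turin" already appeared in the oldest slot and so gains less.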

    Published In

    ACM Transactions on Intelligent Systems and Technology, Volume 5, Issue 1
    Special Section on Intelligent Mobile Knowledge Discovery and Management Systems and Special Issue on Social Web Mining
    December 2013
    520 pages
    ISSN:2157-6904
    EISSN:2157-6912
    DOI:10.1145/2542182
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 03 January 2014
    Accepted: 01 October 2012
    Revised: 01 July 2012
    Received: 01 February 2012
    Published in TIST Volume 5, Issue 1

    Author Tags

    1. Social network analysis
    2. Twitter
    3. aging theory
    4. topic detection and tracking
    5. trends

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 13
    • Downloads (Last 6 weeks): 2
    Reflects downloads up to 27 Jan 2025

    Cited By

    • (2024) Next Topic Recommendation for Influencers on Social Media. 2024 IEEE International Conference on Big Data (BigData), 723-728. DOI: 10.1109/BigData62323.2024.10825496. Online publication date: 15-Dec-2024
    • (2023) A systematic literature review of weak signal identification and evolution for corporate foresight. Kybernetes 53:10, 3160-3188. DOI: 10.1108/K-03-2023-0343. Online publication date: 2-May-2023
    • (2023) Innovation signals: leveraging machine learning to separate noise from news. Scientometrics 128:5, 2649-2676. DOI: 10.1007/s11192-023-04672-y. Online publication date: 1-May-2023
    • (2022) Developing insights from the collective voice of target users in Twitter. Journal of Big Data 9:1. DOI: 10.1186/s40537-022-00611-5. Online publication date: 2-Jun-2022
    • (2022) Hybrid Onion Layered System for the Analysis of Collective Subjectivity in Social Networks. IEEE Access 10, 115435-115468. DOI: 10.1109/ACCESS.2022.3217467. Online publication date: 2022
    • (2021) Topic extraction to provide an overview of research activities. Journal of Information Science 47:5, 590-608. DOI: 10.1177/0165551520920794. Online publication date: 1-Oct-2021
    • (2021) Age Estimation Using Aging/Rejuvenation Features With Device-Edge Synergy. IEEE Transactions on Circuits and Systems for Video Technology 31:2, 608-620. DOI: 10.1109/TCSVT.2020.2981117. Online publication date: Mar-2021
    • (2020) Prediction Model for Information Dissemination in Social Network Media Based on Triangle Ring Attractor. Mathematical Problems in Engineering 2020, 1-11. DOI: 10.1155/2020/1563946. Online publication date: 11-Feb-2020
    • (2019) Burst Topic Detection in Real Time Spatial–Temporal Data Stream. IEEE Access 7, 82709-82720. DOI: 10.1109/ACCESS.2019.2923682. Online publication date: 2019
    • (2018) A systematic literature review of mining weak signals and trends for corporate foresight. Journal of Business Economics 88:5, 643-687. DOI: 10.1007/s11573-018-0898-4. Online publication date: 19-Mar-2018
