DOI: 10.1145/1178745.1178757
Article

Discovering groups of people in Google news

Published: 27 October 2006

Abstract

In this paper, we study the problem of content-based social network discovery among people who frequently appear in world news. Google News is used as the source of data. We describe a probabilistic framework for associating people with groups. A low-dimensional topic-based representation is first obtained for news stories via probabilistic latent semantic analysis (PLSA). This is followed by construction of semantic groups by clustering such representations. Unlike many existing social network analysis approaches, which discover groups based only on binary relations (e.g., co-occurrence of people in a news article), our model clusters people using their topic distribution, which introduces contextual information in the group formation process (e.g., some people belong to several groups depending on the specific subject). The model has been used to study the evolution of people with respect to topics over time. We also illustrate the advantages of our approach over a simple co-occurrence-based social network extraction method.
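
The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not say how a person's topic distribution is computed or which clustering algorithm is used, so everything below is an assumption made for illustration: PLSA is fitted with plain EM in NumPy, a person's topic distribution is taken to be the average of P(z|d) over the stories that mention them, and a simple probability threshold stands in for the clustering step. The toy counts, the names (Alice, Bob, Carol), and the helper `plsa` are hypothetical.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=200, seed=0):
    """Fit PLSA by EM on a (docs x words) count matrix.

    Returns P(z|d) with shape (docs, topics) and P(w|z) with shape
    (topics, words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # (docs, topics, words)
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both factors from expected counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

# Toy corpus (hypothetical): stories 0-1 use words 0-2, stories 2-3 use words 3-5.
counts = np.array([
    [4, 3, 3, 0, 0, 0],
    [8, 6, 6, 0, 0, 0],
    [0, 0, 0, 3, 3, 4],
    [0, 0, 0, 6, 6, 8],
], dtype=float)

# mentions[i, d] = 1 if person i appears in story d (hypothetical people).
names = ["Alice", "Bob", "Carol"]
mentions = np.array([
    [1, 1, 0, 0],  # Alice: stories 0 and 1
    [0, 0, 1, 1],  # Bob: stories 2 and 3
    [0, 1, 1, 0],  # Carol: stories 1 and 2
])

p_z_d, p_w_z = plsa(counts, n_topics=2)

# Assumption: a person's topic distribution is the mean P(z|d) of their stories.
person_topic = mentions @ p_z_d
person_topic /= person_topic.sum(axis=1, keepdims=True)

# Stand-in for clustering: a person joins every topic group where they have
# at least 30% of their probability mass, so multi-group membership is allowed.
memberships = {
    name: [z for z in range(person_topic.shape[1]) if person_topic[i, z] >= 0.3]
    for i, name in enumerate(names)
}

# Baseline for comparison: binary co-occurrence counts between people.
cooc = mentions @ mentions.T
```

In this toy setting the co-occurrence matrix only records that Carol shares one story each with Alice and Bob, while the topic-based view additionally places Carol in both groups, which is the kind of contextual membership the abstract refers to.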

Information

Published In

HCM '06: Proceedings of the 1st ACM international workshop on Human-centered multimedia
October 2006
138 pages
ISBN: 1595935002
DOI: 10.1145/1178745
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. probabilistic latent semantic indexing
  2. social network analysis
  3. text mining
  4. topic evolution

Qualifiers

  • Article

Conference

MM06: The 14th ACM International Conference on Multimedia 2006
October 27, 2006
Santa Barbara, California, USA

Cited By

  • (2024) La visión de Google News desde la academia: scoping review. Doxa Comunicación. Revista Interdisciplinar de Estudios de Comunicación y Ciencias Sociales. DOI: 10.31921/doxacom.n38a1891. Online publication date: 1-Jan-2024
  • (2023) Evaluation framework for facilitating the technology transfers of universities: Focusing on the perspective of technology donors. PLOS ONE 18:12 (e0293951). DOI: 10.1371/journal.pone.0293951. Online publication date: 14-Dec-2023
  • (2018) Bursty event detection from collaborative tags. World Wide Web 15:2 (171-195). DOI: 10.1007/s11280-011-0136-2. Online publication date: 25-Dec-2018
  • (2016) A Content-Based Approach to Social Network Analysis: A Case Study on Research Communities. Digital Libraries on the Move (142-154). DOI: 10.1007/978-3-319-41938-1_15. Online publication date: 1-Jul-2016
  • (2016) Structure of a Media Co-occurrence Network. Proceedings of ECCS 2014 (81-91). DOI: 10.1007/978-3-319-29228-1_8. Online publication date: 4-May-2016
  • (2015) Management of duplicate members on websites. Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015 (1104-1109). DOI: 10.1145/2808797.2808815. Online publication date: 25-Aug-2015
  • (2015) Modelling the User Modelling Community (and Other Communities as Well). User Modeling, Adaptation and Personalization (357-363). DOI: 10.1007/978-3-319-20267-9_31. Online publication date: 11-Jun-2015
  • (2014) Targeted Advertising Based on Social Network Analysis. Applied Mechanics and Materials 488-489 (1306-1309). DOI: 10.4028/www.scientific.net/AMM.488-489.1306. Online publication date: Jan-2014
  • (2011) Extracting multi-dimensional relations. Proceedings of the 20th ACM international conference on Information and knowledge management (1203-1208). DOI: 10.1145/2063576.2063750. Online publication date: 24-Oct-2011
  • (2009) Analyzing Collaborations Through Content-Based Social Networks. Computational Social Network Analysis (387-409). DOI: 10.1007/978-1-84882-229-0_15. Online publication date: 6-Nov-2009