DOI: 10.1145/2488388.2488479
research-article

Mining collective intelligence in diverse groups

Published: 13 May 2013

Abstract

Collective intelligence, which aggregates the shared information from large crowds, is often negatively impacted by unreliable information sources that supply low-quality data. This becomes a barrier to the effective use of collective intelligence in a variety of applications. To address this issue, we propose a probabilistic model that jointly assesses the reliability of sources and finds the true data. We observe that different sources are often not independent of each other. Instead, sources are prone to mutual influence, which makes them dependent when they share information with each other. High dependency between sources makes collective intelligence vulnerable to the overuse of redundant (and possibly incorrect) information from the dependent sources. Thus, we reveal the latent group structure among dependent sources and aggregate the information at the group level rather than from individual sources directly. This prevents collective intelligence from being inappropriately dominated by dependent sources. We also explicitly reveal the reliability of groups and minimize the negative impact of unreliable groups. Experimental results on real-world data sets show the effectiveness of the proposed approach compared with existing algorithms.
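
The core intuition of the abstract, aggregating at the group level so that dependent sources cannot dominate, can be illustrated with a small sketch. This is not the paper's probabilistic model: the group assignment is supplied by hand here (the paper infers it as latent structure), the optional group weights stand in for learned group reliability, and the names `vote_by_source`, `vote_by_group`, `group_of`, and `group_weight` are hypothetical.

```python
from collections import Counter, defaultdict


def vote_by_source(claims):
    """Plain majority vote: every source counts once."""
    return Counter(claims.values()).most_common(1)[0][0]


def vote_by_group(claims, group_of, group_weight=None):
    """Two-stage vote: sources vote inside their group, then groups vote.

    group_of maps source -> group id (assumed known in this sketch).
    group_weight maps group id -> reliability weight (default 1.0).
    """
    by_group = defaultdict(list)
    for source, value in claims.items():
        by_group[group_of[source]].append(value)

    totals = defaultdict(float)
    for group, values in by_group.items():
        # Each group contributes a single value, however many members it has.
        group_value = Counter(values).most_common(1)[0][0]
        weight = 1.0 if group_weight is None else group_weight.get(group, 1.0)
        totals[group_value] += weight
    return max(totals, key=totals.get)


# Toy data: s1, s2, s3 copy one another; s4 and s5 are independent.
claims = {"s1": "A", "s2": "A", "s3": "A", "s4": "B", "s5": "B"}
group_of = {"s1": "g1", "s2": "g1", "s3": "g1", "s4": "g2", "s5": "g3"}

print(vote_by_source(claims))           # the three copies outvote the rest
print(vote_by_group(claims, group_of))  # the copies count as one group
```

In the toy data the source-level vote is carried by the three mutually dependent sources, while the group-level vote treats them as a single voice. Passing a larger weight for a group shifts the outcome back toward it, which loosely mirrors the role of group reliability in the abstract.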

Published In

WWW '13: Proceedings of the 22nd international conference on World Wide Web
May 2013
1628 pages
ISBN: 9781450320351
DOI: 10.1145/2488388

Sponsors

  • NICBR: Núcleo de Informação e Coordenação do Ponto BR
  • CGIBR: Comitê Gestor da Internet no Brasil

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. collective intelligence
  2. crowdsourcing
  3. robust classifier

Qualifiers

  • Research-article

Conference

WWW '13
Sponsor:
  • NICBR
  • CGIBR
WWW '13: 22nd International World Wide Web Conference
May 13 - 17, 2013
Rio de Janeiro, Brazil

Acceptance Rates

WWW '13 Paper Acceptance Rate 125 of 831 submissions, 15%;
Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
