Information Integration for Graph Databases

Link Mining: Models, Algorithms, and Applications

Abstract

With increasing interest in querying and analyzing graph data from multiple sources, algorithms and tools to integrate different graphs become very important. Integration of graphs can take place at the schema and instance levels. While links among graph nodes pose additional challenges to graph information integration, they can also serve as useful features for matching nodes representing real-world entities. This chapter introduces a general framework to perform graph information integration. It then gives an overview of the state-of-the-art research and tools in graph information integration.
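As an illustration of how links can serve as matching features at the instance level, the following is a minimal sketch, not the chapter's framework. It scores a candidate node pair by mixing label similarity with the overlap of the two nodes' neighbor sets. The function names, the toy graphs, and the weight alpha are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity of two node labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def neighbor_overlap(n1: set, n2: set) -> float:
    """Jaccard overlap of two neighbor-label sets (0.0 if both are empty)."""
    union = n1 | n2
    return len(n1 & n2) / len(union) if union else 0.0


def match_score(label1, nbrs1, label2, nbrs2, alpha=0.5):
    """Weighted mix of attribute evidence and link evidence.

    alpha is a hypothetical weight chosen for illustration only.
    """
    return (alpha * name_similarity(label1, label2)
            + (1 - alpha) * neighbor_overlap(nbrs1, nbrs2))


def best_match(label, nbrs, candidates):
    """Return the candidate node label with the highest match score."""
    return max(candidates,
               key=lambda c: match_score(label, nbrs, c, candidates[c]))


# Two toy graphs as adjacency maps: node label -> set of neighbor labels.
# The data is invented for the example.
graph_a = {
    "J. Smith": {"KDD", "Data Mining Lab"},
    "J. Smyth": {"Poetry Review"},
}
graph_b = {
    "John Smith": {"KDD", "Data Mining Lab", "VLDB"},
}

if __name__ == "__main__":
    node = "John Smith"
    print(best_match(node, graph_b[node], graph_a))
```

In this toy setting the two candidate labels are close as strings, and the shared neighbors are what separate them. A real integration pipeline would also need schema-level alignment and a more careful treatment of neighbor identity, since neighbors themselves may be unresolved.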

Notes

  1. http://www.flickr.com

  2. http://www.youtube.com

  3. http://answers.yahoo.com

  4. MIPT is the acronym of Memorial Institute for the Prevention of Terrorism.

Acknowledgements

We gratefully acknowledge support from A*STAR Public Sector R&D, Singapore, Project Number 062 101 0031, for the SSNet Project. We also thank Maureen and Nelman Lubis Ibrahim for implementing the SSnetViz system.

Corresponding author

Correspondence to Ee-Peng Lim.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Lim, EP., Sun, A., Datta, A., Chang, K. (2010). Information Integration for Graph Databases. In: Yu, P., Han, J., Faloutsos, C. (eds) Link Mining: Models, Algorithms, and Applications. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-6515-8_10
