Integration of XML schemas at various "severity" levels

Published: 01 September 2006

Abstract

This paper presents a novel approach for integrating a set of eXtensible Markup Language (XML) Schemas. The proposed approach is specialized for XML, almost automatic, semantic, and "light". A further original feature is that it is parametric with respect to a "severity" level against which the integration task is performed. The paper describes the approach in detail, illustrates various theoretical results, presents the experiments we performed to test it and, finally, compares it with related approaches already proposed in the literature.

Published In

Information Systems, Volume 31, Issue 6
September 2006
171 pages

Publisher

Elsevier Science Ltd.

United Kingdom

Author Tags

  1. XML schemas
  2. homonymies
  3. information source integration
  4. synonymies

Qualifiers

  • Article

Cited By

  • (2024) Representation, detection and usage of the content semantics of comments in a social platform. Journal of Information Science 50:2 (317-341). DOI: 10.1177/01655515221087663. Online publication date: 1-Apr-2024
  • (2022) Element Similarity Calculator in XML Schema Matching. Proceedings of the 27th European Conference on Pattern Languages of Programs (1-10). DOI: 10.1145/3551902.3551970. Online publication date: 6-Jul-2022
  • (2021) An Approach to Extracting Topic-guided Views from the Sources of a Data Lake. Information Systems Frontiers 23:1 (243-262). DOI: 10.1007/s10796-020-10010-x. Online publication date: 1-Feb-2021
  • (2020) An approach to evaluate trust and reputation of things in a Multi-IoTs scenario. Computing 102:10 (2257-2298). DOI: 10.1007/s00607-020-00818-5. Online publication date: 5-May-2020
  • (2019) Transforming XML schemas into OWL ontologies using formal concept analysis. Software and Systems Modeling (SoSyM) 18:3 (2093-2110). DOI: 10.1007/s10270-017-0651-4. Online publication date: 1-Jun-2019
  • (2018) Mining Abstract XML Data-Types. ACM Transactions on the Web 13:1 (1-37). DOI: 10.1145/3267467. Online publication date: 4-Dec-2018
  • (2016) From Diversity-based Prediction to Better Ontology & Schema Matching. Proceedings of the 25th International Conference on World Wide Web (1145-1155). DOI: 10.1145/2872427.2882999. Online publication date: 11-Apr-2016
  • (2013) Schema matching prediction with applications to data source discovery and dynamic ensembling. The VLDB Journal — The International Journal on Very Large Data Bases 22:5 (689-710). DOI: 10.1007/s00778-013-0325-y. Online publication date: 1-Oct-2013
  • (2012) Evaluating PageRank methods for structural sense ranking in labeled tree data. Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics (1-12). DOI: 10.1145/2254129.2254174. Online publication date: 13-Jun-2012
  • (2010) A framework for XML schema integration via conceptual model. Proceedings of the 2010 international conference on Web information systems engineering (84-97). DOI: 10.5555/2044492.2044502. Online publication date: 12-Dec-2010
