Abstract
Forms are a standard way of gathering data into a database. Many applications need to support multiple users whose data-gathering requirements evolve over time, so it is desirable to link dynamic forms to the back-end database automatically. We have developed the FormMapper system, a fully automatic solution that accepts user-created data entry forms and maps and integrates them into an existing database in the same domain. The solution comprises two components: tree extraction and form integration. The tree extraction component uses a probabilistic model, the Hidden Markov Model (HMM), to automatically extract the semantic tree structure of a form. In the form integration component, we develop a merging procedure that maps and integrates a tree into an existing database and extends the database with desired properties. We conducted experiments evaluating the performance of the system on several large databases designed from a number of complex forms. Our experimental results show that the FormMapper system is promising: given the same set of forms, it generated databases that are highly similar (87% overlap) to those generated by human experts.
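As a rough illustration of how an HMM can assign semantic roles to a sequence of form elements, the following minimal sketch runs Viterbi decoding over a toy model. It is not the FormMapper model: the state names (`section`, `question`, `answer`), the observation symbols (`heading`, `text`, `input`), and every probability are hypothetical values chosen only for the example.

```python
import math

# Hypothetical hidden states (semantic roles) and observation symbols
# (visible form element kinds). All values are illustrative assumptions,
# not parameters from the paper.
STATES = ["section", "question", "answer"]

START = {"section": 0.6, "question": 0.3, "answer": 0.1}
TRANS = {
    "section": {"section": 0.1, "question": 0.8, "answer": 0.1},
    "question": {"section": 0.05, "question": 0.15, "answer": 0.8},
    "answer": {"section": 0.2, "question": 0.5, "answer": 0.3},
}
EMIT = {
    "section": {"heading": 0.8, "text": 0.15, "input": 0.05},
    "question": {"heading": 0.1, "text": 0.8, "input": 0.1},
    "answer": {"heading": 0.05, "text": 0.15, "input": 0.8},
}


def viterbi(observations):
    """Return the most likely hidden-state sequence for the observations."""
    if not observations:
        return []
    # scores[s] = best log-probability of any path ending in state s
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor of s under the current scores
            best_prev = max(
                STATES, key=lambda p: scores[p] + math.log(TRANS[p][s])
            )
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state
    path = [max(STATES, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    elements = ["heading", "text", "input", "text", "input"]
    print(list(zip(elements, viterbi(elements))))
```

In a sketch like this, the decoded labels could then be grouped into a tree (for example, each `section` becoming a parent of the `question`/`answer` pairs that follow it); how FormMapper actually builds and merges its trees is described in the paper itself.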
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
An, Y., Khare, R., Song, IY., Hu, X. (2011). Automatically Mapping and Integrating Multiple Data Entry Forms into a Database. In: Jeusfeld, M., Delcambre, L., Ling, TW. (eds) Conceptual Modeling – ER 2011. ER 2011. Lecture Notes in Computer Science, vol 6998. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24606-7_20
DOI: https://doi.org/10.1007/978-3-642-24606-7_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24605-0
Online ISBN: 978-3-642-24606-7
eBook Packages: Computer Science, Computer Science (R0)