
Functionalities for automatic metadata generation applications: a survey of metadata experts' opinions

Published: 01 January 2006

Abstract

This paper reports on the automatic metadata generation applications (AMeGA) project's metadata expert survey. Automatic metadata generation research is reviewed and the study's methods, key findings and conclusions are presented. Participants anticipate greater accuracy with automatic techniques for technical metadata (e.g., ID, language, and format metadata) compared to metadata requiring intellectual discretion (e.g., subject and description metadata). Support for implementing automatic techniques paralleled anticipated accuracy results. Metadata experts are in favour of using automatic techniques, although they are generally not in favour of eliminating human evaluation or production for the more intellectually demanding metadata. Results are incorporated into Version 1.0 of the Recommended Functionalities for automatic metadata generation applications (Appendix A).

Published In

International Journal of Metadata, Semantics and Ontologies, Volume 1, Issue 1
January 2006
84 pages
ISSN:1744-2621
EISSN:1744-263X
Publisher

Inderscience Publishers

Geneva 15, Switzerland

Author Tags

  1. AMeGA project
  2. Dublin Core
  3. automatic metadata generation
  4. human evaluation
  5. human production
  6. intellectual discretion
  7. metadata applications
  8. metadata experts
  9. metadata functionalities
  10. technical metadata

Qualifiers

  • Article
