Abstract
Spoken-word audio collections cover many domains, including radio and television broadcasts, oral narratives, governmental proceedings, lectures, and telephone conversations. The collection, access, and preservation of such data are stimulated by political, economic, cultural, and educational needs. This paper outlines the major issues in the field, reviews the current state of technology, examines the rapidly changing policy issues relating to privacy and copyright, and presents issues relating to the collection and preservation of spoken audio content.
Cite this article
Goldman, J., Renals, S., Bird, S. et al. Accessing the spoken word. Int J Digit Libr 5, 287–298 (2005). https://doi.org/10.1007/s00799-004-0101-0
DOI: https://doi.org/10.1007/s00799-004-0101-0