DOI: 10.5555/1931390.1931399
research-article

Recent advances in automatic speech summarization

Published: 30 May 2007

Abstract

Speech summarization technology, which extracts important information and removes irrelevant information from speech, is expected to play an important role in building speech archives and improving the efficiency of spoken document retrieval. However, speech summarization has a number of significant challenges that distinguish it from general text summarization. Fundamental problems with speech summarization include speech recognition errors, disfluencies, and difficulties of sentence segmentation. Typical speech summarization systems consist of speech recognition, sentence segmentation, sentence extraction, and sentence compaction components. Most research up to now has focused on sentence extraction, using LSA (Latent Semantic Analysis), MMR (Maximal Marginal Relevance), or feature-based approaches, among which no decisive method has yet been found. Proper sentence segmentation is also essential to achieve good summarization performance. How to objectively evaluate speech summarization results is also an important issue. Several measures, including families of SumACCY and ROUGE measures, have been proposed, and correlation analyses between subjective and objective evaluation scores have been performed. Although these measures are useful for ranking various summarization methods, they do not correlate well with human evaluations, especially when spontaneous speech is targeted.
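As an illustration of the sentence extraction step, here is a minimal sketch of MMR-style selection over already-segmented transcript sentences. It is not the system described in the paper: the whitespace tokenizer, the bag-of-words cosine similarity, the use of the whole transcript as the query, and the names `bag_of_words`, `cosine`, `mmr_extract`, and `lam` are all assumptions made for the example.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Lowercased whitespace tokens as term counts (a simplifying assumption)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_extract(sentences, k, lam=0.7):
    """Greedily pick up to k sentence indices, trading relevance against redundancy."""
    vectors = [bag_of_words(s) for s in sentences]
    # The whole transcript stands in for the query (an assumption of this sketch).
    doc = sum(vectors, Counter())
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(vectors[i], doc)
            # Redundancy is the highest similarity to any sentence already chosen.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    # Return indices in original order so the summary follows the speech.
    return sorted(selected)
```

With `lam` close to 1 the selection favors sentences that resemble the transcript as a whole; lower values penalize near-duplicates more strongly, which matters for speech because speakers often repeat themselves. In a real pipeline the input sentences would come from speech recognition and sentence segmentation, so recognition errors and segmentation mistakes would affect the similarity scores.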


Cited By

  • (2008) Mixed-source multi-document speech-to-text summarization. Proceedings of the Workshop on Multi-source Multilingual Information Extraction and Summarization (pp. 33-40). DOI: 10.5555/1613172.1613181. Online publication date: 23-Aug-2008

Published In

RIAO '07: Large Scale Semantic Access to Content (Text, Image, Video, and Sound)
May 2007
817 pages

Sponsors

  • CID (France): Le Centre de Hautes Etudes Internationales D'Informatique Documentaire

Publisher

LE CENTRE DE HAUTES ETUDES INTERNATIONALES D'INFORMATIQUE DOCUMENTAIRE

Paris, France

Qualifiers

  • Research-article

Conference

RIAO07
Sponsor:
  • CID (France)
RIAO07: Large Scale Semantic Access to Content
May 30 - June 1, 2007
Pittsburgh, Pennsylvania
