DOI: 10.1145/3568444.3570588

Supporting Self-development of Speech Delivery for Education Professionals

Published: 29 December 2022


The rapid shift towards online education in recent years has reshaped the skill set required of a modern teacher. Delivering classes in remote and hybrid formats has become an integral part of education. However, providing feedback on lecturing technique at scale remains a challenge. We present the design and implementation of Speecoach, an online tool for lecturers that analyses audio recordings of lectures and provides feedback on their style of speech delivery. Our system analyses the pitch, speech rate and volume dynamics of a recording and compares them against a reference model based on a data set of professional-grade talks. The tool facilitates teachers’ self-development by providing contextualised feedback on the parameters of their speech and comparing them to different forms of public address. Our work contributes a prototype tool that supports public speaking and envisions future developments and studies in the life-long learning of education professionals.
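
The abstract does not say how a recording is compared against the reference model, so the following is only a minimal sketch of one plausible approach: express each parameter of a recording (pitch, speech rate, volume) as a z-score relative to a set of reference talks. The function names, parameter keys, units and all numbers are hypothetical placeholders chosen for illustration; they are not taken from Speecoach or its data set.

```python
from statistics import mean, stdev


def compare_to_reference(recording, reference):
    """Return one z-score per parameter.

    Each z-score says how many reference standard deviations the
    recording's mean lies above (+) or below (-) the reference mean.
    Both arguments map a parameter name to a list of measurements.
    """
    report = {}
    for name, values in recording.items():
        ref_mean = mean(reference[name])
        ref_sd = stdev(reference[name])  # sample standard deviation
        report[name] = (mean(values) - ref_mean) / ref_sd
    return report


def describe(z, tolerance=1.0):
    """Turn a z-score into a short feedback label (tolerance is an assumption)."""
    if z > tolerance:
        return "above reference range"
    if z < -tolerance:
        return "below reference range"
    return "within reference range"


# Placeholder measurements, invented purely for illustration.
reference_talks = {
    "pitch_hz": [100, 120, 140],
    "speech_rate_wpm": [140, 150, 160],
    "volume_lufs": [-25, -23, -21],
}
lecture = {
    "pitch_hz": [150, 170],
    "speech_rate_wpm": [145, 155],
    "volume_lufs": [-28, -26],
}

scores = compare_to_reference(lecture, reference_talks)
feedback = {name: describe(z) for name, z in scores.items()}
for name in scores:
    print(f"{name}: z = {scores[name]:+.2f} ({feedback[name]})")
```

A z-score keeps the feedback contextual: the same lecture could be scored against several reference sets (for example, different kinds of public address) simply by passing a different `reference` dictionary.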

Supplementary Material

Additional visualisations and source code of the prototype software. (Supporting_visuals.pdf)
Additional visualisations and source code of the prototype software. (Supporting_sourcecode.zip)



    Published In

    MUM '22: Proceedings of the 21st International Conference on Mobile and Ubiquitous Multimedia
    November 2022
    315 pages
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. online education
    2. presentation skills
    3. remote learning
    4. self-development
    5. speech


    • Poster
    • Research
    • Refereed limited


    MUM 2022

    Acceptance Rates

    Overall Acceptance Rate 190 of 465 submissions, 41%


