Abstract
Speech production studies and the knowledge they bring forward are of paramount importance to advance a wide range of areas, including phonetics, speech therapy, synthesis and interaction. Several technologies have been considered to study static and dynamic features of the articulators and speech motor control, such as electromagnetic articulography (EMA), real-time magnetic resonance (RTMRI) and ultrasound (US) imaging. While the latest advances in RTMRI provide a wealth of data on the full vocal tract, it is an expensive resource that requires specialized facilities. US is a more affordable alternative in several contexts, enabling the acquisition of larger datasets, but it demands adequate computational approaches for processing and analysis. While the literature is prolific in proposing methods for tongue segmentation from US, the noisy nature of the images and the specificities of the equipment often lead to poor performance on novel datasets. This needs to be assessed before large-scale data acquisition, so that suitable acquisition and processing methods can be devised. In the scope of a line of research studying speech changes with age, this work describes the first results of an automatic tongue segmentation method from US, along with a characterization of the main challenges posed by the image data. Even though improvements are still needed, particularly to ensure temporal coherence, the method at its current stage can already provide the data required for an automatic analysis of maximum tongue height, a relevant parameter to assess speech changes in vowel production.
Acknowledgements
This research was financially supported by the projects VoxSenes (POCI-01-0145-FEDER-03082) and MEMNON (POCI-01-0145-FEDER-028976) – COMPETE2020 under POCI and FEDER, and by national funds (OE), through FCT/MCTES, SOCA – Smart Open Campus CENTRO-01-0145-FEDER-000010 (Portugal 2020 under POCI and FEDER) and by IEETA Research Unit funding (UIDB/00127/2020). Luciana Albuquerque’s work is funded by the FCT through grant SFRH/BD/115381/2016.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Barros, F., Valente, A.R., Albuquerque, L., Silva, S., Teixeira, A., Oliveira, C. (2020). Contributions to a Quantitative Unsupervised Processing and Analysis of Tongue in Ultrasound Images. In: Campilho, A., Karray, F., Wang, Z. (eds) Image Analysis and Recognition. ICIAR 2020. Lecture Notes in Computer Science, vol 12132. Springer, Cham. https://doi.org/10.1007/978-3-030-50516-5_15
DOI: https://doi.org/10.1007/978-3-030-50516-5_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-50515-8
Online ISBN: 978-3-030-50516-5
eBook Packages: Computer Science (R0)