Adaptive Noise Cancellation for Speech Recorded During Magnetic Resonance Imaging: Formant Extraction and Analysis

Authors

  • Jack Laub, Department of Mathematics and Systems Analysis, Western Norway University of Applied Sciences, Finland
  • Thomass Rofsky, Department of Signal Processing and Acoustics, Western Norway University of Applied Sciences, Finland
  • Sara Weinreb
  • Dave Lee, Department of Mathematics and Systems Analysis, Western Norway University of Applied Sciences, Finland

Keywords

speech recording, magnetic resonance imaging, noise reduction, adaptive comb filtering, formant extraction, vocal tract, acoustic artifacts.

Abstract

Speech recorded during Magnetic Resonance Imaging (MRI) of the upper airways is often contaminated by acoustic noise from the MRI scanner. This paper addresses the post-processing of such speech samples with adaptive comb filtering so that formants can be extracted accurately. Two types of speech material were used to validate the proposed algorithm: prolonged vowel productions recorded during MRI, and comparison data recorded in an anechoic chamber. Spectral envelopes and vowel formants were computed from both the post-processed speech and the comparison data. Numerical acoustic models and 3D-printed physical models of the vocal tract were used for further analysis. The results reveal a significant, frequency-dependent discrepancy between the vowel formants obtained from recordings made during MRI and those from the comparison data. The discrepancy is attributed to changes in the acoustics caused by the surfaces of the MRI head coil, which produce "exterior formants" at around 1 kHz and 2 kHz. It is too large to disregard when MRI recordings are used for parameter estimation or for validating numerical speech models based on MR images. However, the influence of the test subject's adaptation to noise and of the constrained-space acoustics during an MRI examination cannot be ruled out.
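
The abstract does not describe the filter structure, so the following is only a minimal sketch of one common way to realise comb filtering of periodic noise, not the authors' algorithm. It assumes that the scanner noise is approximately periodic, that a noise-only segment is available for estimating its period, and that a simple IIR comb notch is adequate. The function names `estimate_period` and `comb_notch`, the pole radius `r`, and the 50 to 500 Hz search range are all hypothetical choices made for illustration.

```python
import numpy as np


def estimate_period(noise, fs, f_min=50.0, f_max=500.0):
    """Estimate the period (in samples) of periodic noise.

    Assumption: `noise` is a noise-only segment sampled at `fs` Hz and its
    fundamental lies between `f_min` and `f_max`.
    """
    noise = np.asarray(noise, dtype=float)
    noise = noise - np.mean(noise)
    n = len(noise)
    # Autocorrelation via the FFT, zero-padded to avoid circular wrap-around.
    spectrum = np.fft.rfft(noise, 2 * n)
    acf = np.fft.irfft(np.abs(spectrum) ** 2)[:n]
    lag_lo = int(fs / f_max)
    lag_hi = int(fs / f_min)
    # The strongest autocorrelation peak in the lag range gives the period.
    return lag_lo + int(np.argmax(acf[lag_lo:lag_hi + 1]))


def comb_notch(x, period, r=0.95):
    """IIR comb notch H(z) = g (1 - z^-D) / (1 - r z^-D), with g = (1 + r) / 2.

    The notches sit at integer multiples of fs / period. A pole radius `r`
    closer to 1 gives narrower notches but a longer settling time.
    """
    x = np.asarray(x, dtype=float)
    g = 0.5 * (1.0 + r)
    y = np.zeros(len(x))
    for n in range(len(x)):
        x_delayed = x[n - period] if n >= period else 0.0
        y_delayed = y[n - period] if n >= period else 0.0
        y[n] = g * (x[n] - x_delayed) + r * y_delayed
    return y
```

Given a noise-only segment `noise` and a recording `x` at sampling rate `fs`, the filtered signal would be `comb_notch(x, estimate_period(noise, fs))`. Re-estimating the period frame by frame is one way such a filter could be made to follow slow drift in the noise, which is the sense in which this sketch is "adaptive"; formant estimation would then run on the filtered output.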

Published

2021-08-28

How to Cite

Laub, J., Rofsky, T., Weinreb, S., & Lee, D. (2021). Adaptive Noise Cancellation for Speech Recorded During Magnetic Resonance Imaging: Formant Extraction and Analysis. Journal of Data-Driven Engineering Systems, 1(3ba08). Retrieved from https://esajournals.com/index.php/JDDES/article/view/6