Phone: +90 312 2107889 Address: Department of Modelling and Simulation, Informatics Institute, Middle East Technical University, Universiteler Mah., Dumlupinar Bulvari, No 1 Cankaya, 06800 Ankara, Turkey
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014
Acoustic intensity is a vectorial measure of acoustic energy flow through a given region of interest. Three-dimensional measurement of acoustic intensity requires special microphone array configurations. This paper provides a theoretical analysis of open spherical microphone arrays for the 3-D measurement of acoustic intensity. The calculations of the pressure and the particle velocity components of the sound field inside a closed volume are expressed using the Kirchhoff-Helmholtz integral equation. The conditions which simplify the calculation are identified. This calculation is then constrained to a finite set of microphones positioned at prescribed points on an open sphere. Several open spherical array topologies are proposed. Their magnitude and directional errors and measurement bandwidths are investigated via numerical simulations. A comparison with conventional open-sphere 3-D intensity probes is presented.
2014 22nd Signal Processing and Communications Applications Conference (SIU), 2014
Open-spherical acoustic intensity probes are microphone arrays based on the Kirchhoff-Helmholtz integral and are used in the measurement of active acoustic intensity. The acoustic intensity measurements obtained by these arrays can be used to localise sound sources. Previously, the performance of these arrays in acoustic free-field conditions was assessed using numerical simulations, and it was shown that they provide better performance than other types of probes. This paper discusses the implementation of open spherical microphone arrays and their performance in reverberant enclosures.
IEEE Transactions on Signal Processing, Oct 1, 2007
One of the simplest ways of designing allpass fractional-delay filters with maximally flat group delays is by using the Thiran approximation, by which the filter coefficients are calculated using a closed-form equation. However, due to the number of multiplications and divisions involved, the calculation of these coefficients is a computationally costly task and is not suitable for real-time applications. The analysis of a root-displacement-based interpolation method used in allpass tunable fractional delays is presented in this paper. The method allows continuous adjustments of the approximated fractional delay without the explicit calculation of a new set of filter coefficients. The transient error observed at the output due to the change of filter coefficients is analyzed. The direct and cascade implementations are compared with respect to their transient errors. An example application of the proposed method from the field of model-based sound synthesis is given.
A simple, accurate and efficient variable fractional delay (VFD) IIR filter is presented. The method is based on the interpolation of pole loci of maximally-flat allpass fractional-delay filters. The responses of the proposed VFD filter for different fractional delay values are simulated and the approximation errors are shown to be very small.
2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009
Multichannel audio reproduction generally suffers from one or both of the following problems: i) the recorded audio has to be artificially manipulated to provide the necessary spatial cues, which reduces the consistency of the reproduced sound field with the actual one, and ii) reproduction is not panoramic, which degrades realism when the listener is not seated in a desired ideal position facing the center channel. A recording method using a circularly symmetric array of differential microphones and a reproduction method using a corresponding array of loudspeakers are presented in this paper. Design of microphone directivity patterns to achieve a panoramic auditory scene is discussed. Objective results in the form of active intensity diagrams are presented.
2011 Seventh International Conference on Signal Image Technology & Internet-Based Systems, 2011
Modelling, simulation and auralisation of room acoustics play an important role in computer games and virtual reality applications by increasing the level of realism. Accurate simulation of room acoustics is a computationally costly process which is often substituted with artificial reverberators that provide a computationally simpler alternative. However, such systems lack accuracy and are not in general able to simulate important aspects of room acoustics such as early reflections, source/microphone directivity, and frequency-dependent absorption. A new type of interactive and scalable room simulator named the scattering delay network (SDN) was recently proposed by the authors. A frequency-domain analysis and implementation of that simulator are presented in this paper. Numerical simulation examples which demonstrate the utility of the proposed system are provided.
Various techniques have previously been proposed for the separation of convolutive mixtures. These techniques can be classified as stochastic, adaptive, and deterministic. Stochastic methods are computationally expensive since they require an iterative process for the calculation of the demixing filters based on a separation criterion that usually assumes that the source signals are statistically independent. Adaptive methods, such as adaptive beamformers, also exploit signal properties in order to optimize a multichannel filter structure. However, these algorithms need initialization and time to converge. Deterministic methods, on the other hand, provide a closed-form solution based on the deterministic aspects of the problem, such as the channel characteristics and the source directions. This paper presents a technique that exploits the intensity vector statistics to achieve a nearly closed-form solution for the separation of convolutive mixtures as recorded with a coincident microphone array. No assumptions are made on the signals, but it is assumed that the source directions are known a priori. Directivity functions based on von Mises functions are designed for beamforming depending on the circular statistics of the calculated intensity vectors. Numerical evaluation results are presented for various speech and instrument sounds and source positions in two reverberant rooms.
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015
An acoustic reverberator consisting of a network of delay lines connected via scattering junctions is proposed. All parameters of the reverberator are derived from physical properties of the enclosure it simulates. It allows for simulation of unequal and frequency-dependent wall absorption, as well as directional sources and microphones. The reverberator renders the first-order reflections exactly, while making progressively coarser approximations of higher-order reflections. The rate of energy decay is close to that obtained with the image method (IM) and consistent with the predictions of the Sabine and Eyring equations. The time evolution of the normalized echo density, which was previously shown to be correlated with the perceived texture of reverberation, is also close to that of IM. However, its computational complexity is one to two orders of magnitude lower, comparable to the computational complexity of a feedback delay network (FDN), and its memory requirements are negligible.
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013
Acoustic intensity can be used for different purposes such as sound source localisation, source separation and spatial audio object coding. Three-dimensional measurement of the acoustic intensity requires the design of special microphone arrays. A theoretical analysis and numerical simulations of intensity measurements using open spherical microphone arrays are presented in this paper. The calculation of the acoustic intensity using signals from an open spherical microphone array is presented first. Error metrics are defined to quantify the magnitude and directional errors. Isotropic operating range is defined as the upper frequency for which both errors are within prescribed bounds. Different array topologies are compared using numerical simulations.
Affective computing is a term for the design and development of algorithms that enable computers to recognize the emotions of their users and respond in a natural way. Speech, along with facial gestures, is one of the primary modalities with which humans express their emotions. While emotional cues in speech are available to an interlocutor in a dyadic conversation setting, their subjective recognition is far from accurate. This is due to the human auditory system, which is primarily non-linear and adaptive. An automatic speech emotion recognition algorithm based on a computational model of the human auditory system is described in this paper. The devised system is tested on three emotional speech datasets. The results of a subjective recognition task are also reported. It is shown that the proposed algorithm provides recognition rates that are comparable to those of human raters.
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005
Head-related transfer function (HRTF) filters are used in virtual auditory displays for the binaural synthesis of the direction of a sound source over headphones. Once low-order HRTF filters are designed, the interpolation of these filters becomes an important issue for the synthesis of moving sound sources. An HRTF filter interpolation method based on the displacement of HRTF filter roots is proposed. It is possible to obtain a minimum-phase interpolated filter given that the original filters are also minimum-phase. The computational complexity of the method is lower than that of the linear interpolation of magnitude responses.
IEEE Transactions on Audio, Speech, and Language Processing, 2000
Multichannel dereverberation amounts to the inversion of a multiple-input/multiple-output linear time-invariant system. In this paper, necessary and sufficient conditions for perfect dereverberation using stable and FIR filters are established. It is then shown that the inverse system given by the pseudoinverse of the original transfer function matrix exhibits a noise reduction property. A necessary and sufficient condition under which this pseudoinverse system is FIR is also given. Further, an FIR approximation to the pseudoinverse system is considered and the effects of the length of this approximation on the dereverberation accuracy are investigated. Finally, an analytical and numerical assessment of the dependence of the dereverberation accuracy on the accuracy of the acquisition of room impulse responses is provided.
Papers by Huseyin Hacihabiboglu