Brian F G Katz
  • Paris, France
The use of room acoustic auralizations has increased due to the improving computing power available and the quality of numerical modelling software. In such auralizations, it is often possible to prescribe the directivity of an acoustic source in order to better represent the way in which a given acoustic source excites the room. However, such directivities are static, being defined according to source excitation as a function of frequency for the numerical simulation. While sources such as pianos vary little over the course of playing, it is known that voice directivity varies, sometimes considerably, due to both phoneme-dependent radiation patterns [1] linked to changes in mouth geometry and dynamic orientation.
Advances in computational power have opened the doors to higher resolution acoustic modelling for large-scale spaces where acoustics is crucial and spaces are increasingly complicated. As such, auralizations are becoming more prevalent in architectural acoustics and virtual reality. However, there have been few studies examining the perceptual quality achievable by room acoustic simulations and auralizations. This paper presents a summary of several recent studies involving the evaluation, objectively with regards to acoustic parameters, and perceptively through listening tests, of room acoustic simulations where subjective equivalency to reality was the driving force. Presented studies involve the elaboration of a calibration method for simulations, inclusion of dynamic source directivity characteristics, and the assessment of various simulation methodologies in the context of coupled volumes. These studies were carried out using existing spaces in order to have a real reference. R...
 This study presents an exploration task using interactive sonification to compare different sonification mapping concepts. Based on the real application of protein-protein docking within the CoRSAIRe project («Combinaisons de Rendus Sensori-moteurs pour l'Analyse Immersive de Résultats», or Combination of sensori-motor rendering for the immersive analysis of results), an abstraction of the task was developed which simulates the basic concepts involved. Two conditions were evaluated, the inclusion or absence of spatialized ...
This article presents a case study of higher-order Ambisonics (HOA) for real-time sound field reproduction in a small room with a 157-loudspeaker array. It addresses a number of specific questions and practical issues in system design and implementation, such as the reproduction room's acoustics, loudspeaker positioning and radiation patterns, distributed computing and audio channel synchronization, and, more generally, the achievable accuracy of sound field reproduction. In the current configuration of the system, Ambisonics up to order n = 6 is applied and the decoders are rendered in parallel on a cluster of four computers. For this reason, synchronization and communication between the different computers become a challenging task for achieving good system performance. The overall system latency and the inter-channel synchronicity have been measured using time-stretched pulse (TSP) signals. The measurement results show a maximum (unsigned) latency of 51 samples, which corresponds to t = 1.1 ms. The acoustics of the reproduction room have a strong effect on the accuracy of the Ambisonics sound field reproduction. To achieve semi-anechoic conditions, sound absorption materials have been installed in the room. Finally, spatial filters have been applied to each individual loudspeaker to correct for different orientations with reference to the sweet spot. These filters have been derived from radiation pattern measurements in an anechoic chamber.
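The figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions that the abstract does not state: that the system uses full 3D Ambisonics, for which an order-n signal set has (n + 1)^2 components, and that the audio sample rate is 48 kHz, which is the common rate that makes 51 samples round to 1.1 ms. The function names are hypothetical.

```python
def hoa_channel_count(order: int) -> int:
    """Number of spherical-harmonic signals for full 3D Ambisonics of a given order."""
    return (order + 1) ** 2


def samples_to_ms(samples: int, sample_rate_hz: float) -> float:
    """Convert a delay expressed in samples to milliseconds."""
    return 1000.0 * samples / sample_rate_hz


# Assumption: the sample rate is not given in the abstract; 48 kHz is a guess.
ASSUMED_SAMPLE_RATE_HZ = 48_000

channels = hoa_channel_count(6)                          # order n = 6 from the abstract
latency_ms = samples_to_ms(51, ASSUMED_SAMPLE_RATE_HZ)   # 51 samples from the abstract

print(f"{channels} Ambisonic signals, {latency_ms:.4f} ms latency")
```

Under these assumptions the decoders would map 49 Ambisonic signals onto the 157 loudspeakers. At 44.1 kHz the same 51 samples would be closer to 1.2 ms, which is why 48 kHz is the rate assumed here.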
The Cathedrale Notre-Dame de Paris is amongst the most well-known worship spaces in the world. Its large volume, in combination with a relatively bare stone construction and marble floor, leads to rather long reverberation times. The cathedral suffered from a significant fire in 2019, resulting in damage primarily to the roof and vaulted ceiling. Despite the notoriety of this space, there are few examples of published data on the acoustical parameters of this space, and these data do not agree. Archived measurement recordings from 1987 were recovered and found to include several balloon bursts. In 2015, a measurement session was carried out for a virtual reality project. Comparisons between results from these two sessions show a slight but significant decrease in reverberation time (8%) in the pre-fire state. Measurements were recently carried out on the construction site, 1 year since the fire. Compared to 2015 data, the reverberation time significantly decreased (20%). This paper ...
The Inter-aural Time Difference (ITD) is a fundamental cue for human sound localization. Over the past decades, several methods have been proposed for its estimation from measured Head-Related Impulse Response (HRIR) data. Nevertheless, inter-method variations in ITD calculation have been found to exceed the known Just Noticeable Differences (JNDs), hence leading to possible perceptible artifacts in virtual binaural auditory scenes, even for cases when personalized HRIRs are being used. In the absence of an objective means for validating ITD estimations, this paper evaluates which methods lead to the most perceptually relevant results. A subjective lateralization study compared objective ITDs to perceptually driven inter-aural pure delay offsets. Results clearly indicate the first-onset Threshold detection method, using a low relative threshold of -30 dB, applied on 3 kHz low-pass filtered HRIRs as the most perceptually relevant procedure across various metrics. Alternative threshold values and methods ba...
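A minimal sketch of a first-onset threshold ITD estimate is given below, using the two parameters named in the abstract: a 3 kHz low-pass filter and a relative threshold of -30 dB. Everything else is an assumption made for illustration: the filter design (a 65-tap Hamming-windowed sinc), the sign convention (left onset minus right onset), the function names, and the synthetic impulse pair used as input. It is not the authors' implementation.

```python
import math


def lowpass_fir(signal, cutoff_hz, sample_rate_hz, num_taps=65):
    """Hamming-windowed sinc low-pass filter, delay-compensated, same output length.

    Assumption: the abstract does not specify the filter design.
    """
    fc = cutoff_hz / sample_rate_hz
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    taps = [t / gain for t in taps]  # unity gain at DC
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = i + mid - j
            if 0 <= idx < len(signal):
                acc += t * signal[idx]
        out.append(acc)
    return out


def first_onset_index(hrir, threshold_db=-30.0):
    """Index of the first sample whose magnitude reaches the threshold relative to the peak."""
    peak = max(abs(v) for v in hrir)
    if peak == 0.0:
        raise ValueError("impulse response is all zeros")
    limit = peak * 10.0 ** (threshold_db / 20.0)
    for i, v in enumerate(hrir):
        if abs(v) >= limit:
            return i
    raise ValueError("no sample reaches the threshold")


def estimate_itd_seconds(hrir_left, hrir_right, sample_rate_hz,
                         cutoff_hz=3000.0, threshold_db=-30.0):
    """ITD as left onset minus right onset (assumed convention: negative = left ear leads)."""
    left = lowpass_fir(hrir_left, cutoff_hz, sample_rate_hz)
    right = lowpass_fir(hrir_right, cutoff_hz, sample_rate_hz)
    delta = first_onset_index(left, threshold_db) - first_onset_index(right, threshold_db)
    return delta / sample_rate_hz


# Synthetic HRIR pair (not measured data): the right ear lags the left by 20 samples.
fs = 48_000
hrir_l = [0.0] * 256
hrir_r = [0.0] * 256
hrir_l[40] = 1.0
hrir_r[60] = 0.5
itd = estimate_itd_seconds(hrir_l, hrir_r, fs)
print(f"ITD = {itd * 1e6:.1f} microseconds")
```

Because the threshold is relative to each ear's own peak, the level difference between the two ears does not bias the onset positions in this toy case; only the 20-sample delay is recovered.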
As part of the 850-year anniversary of Notre-Dame cathedral, Paris, there was a special performance of “La Vierge.” A close-mic recording of the concert was made by the Conservatoire de Paris. In an attempt to provide a new type of experience, a virtual recreation of the performance using these roughly 45 audio channels was made via auralization. A computational acoustic model was created and calibrated based on in-situ measurements for reverberation and clarity parameters. A perceptual study with omnidirectional source and binaural receiver validated the calibrated simulation for the tested subjective attributes of reverberation, clarity, source distance, tonal balance, coloration, plausibility, ASW, and LEV when compared to measured responses. Instrument directivity was included for each track's representative orchestral section based on published data. Higher-Order Ambisonic (3rd order) RIRs were generated for all source and receiver combinations using the CATT-Acoustic TUCT software. Virtual navigatio...
The inter-aural time difference (ITD) is a fundamental cue for human sound localization. Over the past decades several methods have been proposed for its estimation from measured head-related impulse response (HRIR) data. Nevertheless, inter-method variations in ITD calculation have been found to exceed the known just noticeable differences (JNDs), leading to possible perceptible artifacts in virtual binaural auditory scenes, when personalized HRIRs are being used. In the absence of an objective means for validating ITD estimations, this paper examines which methods lead to the most perceptually relevant results. A subjective lateralization study compared objective ITDs to perceptually evaluated inter-aural pure delay offsets. Results clearly indicate the first-onset threshold detection method, using a low relative threshold of -30 dB, applied on 3 kHz low-pass filtered HRIRs as consistently the most perceptually relevant procedure across various metrics. Several alternative thresho...
This paper addresses the use of audio and haptics as a means to reduce the load of the visual channel in interaction tasks within virtual environments. An examination is made regarding the exploitation of audio and/or haptic interactions for the acquisition of a target of interest in an environment containing multiple and obscured distractors. A first study compares means for identifying and locating a specified target among others employing either audio, haptic, or both sensori-motor channels activated simultaneously. Following an analysis of the results and subject comments, an improved multimodal approach is proposed and evaluated in a second study, combining advantages offered by each sensory channel. Results confirm the efficiency and effectiveness of the proposed multimodal approach.
In the absence of a well-suited measure for quantifying binaural data variations, this study presents the use of a global perceptual distance metric which can describe both HRTF and listener similarities. The metric is derived from subjective evaluations of binaural renderings of a sound moving along predefined trajectories in the horizontal and median planes. Its characteristics and advantages in describing data distributions based on perceptually relevant attributes are discussed. In addition, the use of 24 HRTFs from two different databases of origin allows for an evaluation of the perceptual impact of some database-dependent characteristics on spatialization. The effectiveness of the experimental design as well as the correlation between the HRTF evaluations of the two plane trajectories are also discussed.
This paper presents the use of audio and haptic feedback to reduce the load of the visual channel in interaction tasks within virtual environments. An examination is made regarding the exploitation of audio and/or haptic cues for the acquisition of a desired target in an environment containing multiple and obscured distractors. This study compares different ways of identifying and locating a specified target among others by means of either audio feedback, haptic feedback, or both rendered simultaneously. The analysis of results ...
A methodology based on the linguistic exploration of spontaneous descriptions was developed to investigate the sound quality of urban environments. A mail survey was first conducted to better understand how low frequencies affect people in their everyday life. Three other experiments were carried out to explore the ecological validity of experimental settings in laboratory conditions. The reference study consisted of interviews conducted in actual environments, which were also recorded simultaneously. The recordings were used for two listening tests, the first one using stereophonic reproduction and the second one using multichannel reproduction. The comparison of the verbal data collected in the different contexts sketches some theoretical and methodological issues concerning the reproduction of everyday life scenes in laboratory conditions. The linguistic analyses indicate that the "same" acoustic phenomenon gives rise to different cognitive representations, depending on the spatial presentation of the stimuli. It follows that the quality of the reproduction system must be adapted to specific properties of mental representations (here, spatial immersion vs. source identification). The analysis ... (Footnote 1: Work performed while at the Laboratoire d'Acoustique Musicale, Université Paris 6, CNRS (UMR 7604), 11 rue de Lourmel, F75015 Paris, France.)
Haptic and auditory cues have often been used to help reduce the cognitive load of the visual channel for the exploration of large data sets. Data exploration includes many different tasks. The current study investigates a target selection task in which only haptic and auditory feedbacks are rendered to the user, without visual rendering. Haptic and auditory information provides guidance cues to the user for arriving at a given spatial target position. The haptic rendering acts like a virtual magnet that physically attracts the user ...
Today there are two major evolutions in spatial audio. First, an enhanced 3D audio experience, where virtual sound sources can be accurately synthesized in any direction, is possible with technologies such as binaural, Wave Field Synthesis, Higher Order Ambisonics or Vector Base Amplitude Panning. Second, 3D audio is on the way to being democratized through binaural adaptation for headphone listening. These evolutions call for revisiting the methods and tools used to assess the perception of spatial sound reproduction. The first objective of this paper is to delineate the problem by exploring the potential dimensions and the related attributes underlying the perception of spatial sound, mainly within the context of binaural reproduction. Secondly, assessment methods, including both standard and less conventional ones, are listed, and their relevance for measuring the previously identified attributes is discussed.
The use of acoustical scale models has been replaced for the most part by computational models and numerical simulations for room acoustic studies as well as artificial reverberation units. There remains however a number of acoustical phenomena which are difficult to address with computer simulations, such as coupled volumes, diffraction, and complex scattering, due to the computational complexity and/or calculation time necessary for addressing such acoustical wave phenomena on the scale of room acoustical problems, even small rooms. This paper presents a pilot study of a rather unique artistic architectural structure consisting of a self-supporting construction composed of small stacked linear elements. Acoustically, the structure combines modal behavior, concave forms, and very regular scattering patterns. An example scale model has been constructed and studied in order to separate different construction features and their associated acoustics effects. In an attempt to explore th...
Notre-Dame de Paris is amongst the most well-known worship spaces in the world. Its large volume, in combination with a relatively bare stone construction and marble floor, leads to rather long reverberation times. Despite the notoriety of this space, there are few examples of published data on the acoustical parameters of this space, and these data are often not in agreement. Archived measurement recordings from 1987 were recovered and found to include several balloon bursts. In 2015, a measurement session was carried out which included similar source-receiver pairs using both balloon bursts and swept sine stimuli. Comparisons between results from these two sessions show a significant decrease in reverberation time in the modern state. This change is attributed to the addition of carpet in several areas of the cathedral. A geometrical acoustics model of the cathedral was constructed and calibrated from the 2015 measurements. The effect of carpeting was investigated through simulati...
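Reverberation time is commonly estimated from a measured impulse response, such as a balloon burst recording, by Schroeder backward integration followed by a line fit to the decay curve. The sketch below illustrates that general technique. It is not confirmed to be the processing used in this study. The evaluation range of -5 dB to -25 dB (a T20-style estimate), the function names, and the synthetic exponentially decaying input are all assumptions.

```python
import math


def energy_decay_curve_db(impulse_response):
    """Schroeder backward integration, normalized to 0 dB at the first sample."""
    tail = [0.0] * len(impulse_response)
    total = 0.0
    for i in range(len(impulse_response) - 1, -1, -1):
        total += impulse_response[i] ** 2
        tail[i] = total
    if tail[0] == 0.0:
        raise ValueError("impulse response has no energy")
    # Trailing samples with zero remaining energy are dropped to avoid log(0).
    return [10.0 * math.log10(e / tail[0]) for e in tail if e > 0.0]


def reverberation_time_s(impulse_response, sample_rate_hz,
                         start_db=-5.0, stop_db=-25.0):
    """Least-squares line fit on the decay curve, extrapolated to a 60 dB decay."""
    edc = energy_decay_curve_db(impulse_response)
    points = [(i / sample_rate_hz, level)
              for i, level in enumerate(edc) if stop_db <= level <= start_db]
    if len(points) < 2:
        raise ValueError("not enough decay range for a fit")
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(level for _, level in points) / len(points)
    slope = (sum((t - mean_t) * (level - mean_l) for t, level in points)
             / sum((t - mean_t) ** 2 for t, _ in points))  # dB per second
    return -60.0 / slope


def relative_change(old, new):
    """Fractional change from old to new (negative means a decrease)."""
    return (new - old) / old


# Synthetic decay (not measured data): amplitude falls 60 dB in exactly 2 s.
fs = 1000
true_rt = 2.0
ir = [10.0 ** (-3.0 * (n / fs) / true_rt) for n in range(4 * fs)]
rt = reverberation_time_s(ir, fs)
print(f"estimated RT = {rt:.3f} s")
```

The helper `relative_change` shows how two such estimates from different sessions could be compared as a percentage, which is the form in which changes between measurement campaigns are usually reported.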
Perceptual differences between sound reproduction systems with multiple spatial dimensions have been investigated. Two blind studies were performed using system configurations involving 1-D, 2-D, and 3-D loudspeaker arrays. Various types of source material were used, ranging from urban soundscapes to musical passages. Experiment I consisted of collecting subjects' perceptions in a free-response format to identify relevant criteria for multi-dimensional spatial sound reproduction of complex auditory scenes by means of linguistic analysis. Experiment II utilized both free response and scale judgments for seven parameters derived from Experiment I. Results indicated a strong correlation between the source material (sound scene) and the subjective evaluation of the parameters, making the notion of an "optimal" reproduction method difficult for arbitrary source material.
This paper presents the details related to the creation of a public database of anechoic audio and 3D-video recordings of several small music ensemble performances. Musical extracts range from baroque to jazz music. This work aims at extending the already available public databases of anechoic stimuli, providing the community with flexible audiovisual content for virtual acoustic simulations. For each piece of music, musicians were first close-mic recorded together to provide an audio performance reference. This recording was followed by individual instrument retake recordings, while listening to the reference recording, to achieve the best audio separation between instruments. In parallel, 3D-video content was recorded for each musician, employing a multiple Kinect 2 RGB-Depth sensors system, allowing for the generation and easy manipulation of 3D point-clouds. Details of the choice of musical pieces, recording procedure, and technical details on the system architecture including p...
We propose a new method for measuring the threshold of 50% sentence intelligibility in noisy or multi-source speech communication situations (Speech Reception Threshold, SRT). Our SRT test adds a French test method to those already available, e.g., for English, German, Dutch, Swedish and Finnish. The approach we take is based on semantically unpredictable sentences (SUS), which can in principle be created for various languages. In this way, the proposed method enables better cross-language comparisons of intelligibility tests. As a starting point for the French language, a set of 288 sentences (24 lists of 12 sentences each) was created. Each of the 24 lists is optimized for homogeneity in terms of phoneme distribution as compared to average French, and for word occurrence frequency of the employed monosyllabic keywords as derived from French language databases. Based on the optimized text material, a speech target sentence database has been recorded with a trained speaker. A test calibrat...
By convolving an audio stream with a given pair of impulse responses between a source position and the two ears, virtual sound scenes can be created over headphones. Typically, the set of these filters for an ensemble of spatial positions, termed the Head-Related Impulse Response (HRIR), is used to render position information of a sound object to a listener. However, HRIRs are measured in free-field conditions, ignoring room reflections. In the real world, multiple reflections and reverberation exist, producing complex, rich sound spaces. Including room reflections and reverberation with the HRIR results in a binaural room impulse response (BRIR). The length of a given BRIR depends on the shape and volume of the room, with BRIRs having typical durations of several seconds, resulting in computationally long processing. When the virtual environment is updated in response to head/body movement, BRIRs need to be updated according to the relative direction of a sound object within the percep...
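The rendering step described in this abstract, convolving one audio stream with a left-ear and a right-ear impulse response, can be sketched in a few lines. The example below uses direct time-domain convolution and toy three-sample responses. The function names and input values are hypothetical and chosen only for illustration.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) linear convolution of two sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, ir_left, ir_right):
    """Return (left, right) headphone signals for one source position."""
    return convolve(mono, ir_left), convolve(mono, ir_right)


# Toy data (not measured responses): the right ear is later and quieter.
mono = [1.0, 0.5]
ir_left = [0.0, 1.0]
ir_right = [0.0, 0.0, 0.5]

left, right = render_binaural(mono, ir_left, ir_right)
print(left)
print(right)
```

Direct convolution costs one multiply-add per pair of signal and filter samples, so a BRIR lasting several seconds makes this form expensive. FFT-based block convolution is a common way to reduce that cost, although the abstract does not say which approach the authors take.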
Prior to Sabine’s work on the Fogg Art Museum and Boston Symphony Hall, several numerical guidelines had been developed and applied to the design of rooms with specific acoustic demands such as theatres, concert halls, and opera houses. Previous papers have discussed guidelines based on the following principles: voice directivity, which was employed in the design of at least 11 rooms; “echo theory”, which quantifies the perception threshold between direct sound and first order reflections in order to prevent echoes from occurring, aiding in the design of at least 7 rooms and leading to the first known use of an acoustic scale model; and notions of reverberation, which influenced the design of at least 14 rooms. This paper discusses three additional pre-Sabine numerical guidelines that were used in room acoustic design: (1) audience rake, (2) stage acoustics and proscenium design, and (3) length, width, and height ratios. The origin of these theories, as well as examples of rooms in ...
Dynamic changes of the Head-Related Transfer Function renderings as a function of head movement have been shown to be an important cue in sound localization. To investigate the cognitive process of dynamic sound localization, quantification of the characteristics of head movements is needed. In this study, trajectories of head rotation in a sound localization task were measured and analyzed. Listeners were asked to orient themselves towards the direction of the active sound source via localization, the source being one of five loudspeakers located at 30° intervals in the horizontal plane. A 1 s pink noise burst stimulus was emitted from different speakers in random order. The range of expected head rotations (EHR) for a given stimulus was, therefore, from 30° to 120°. Head orientation (yaw, pitch, and roll) was measured with a motion capture system. Analysis examined angular velocity, overshoot, and reaction time (RT). Results show that angular velocity increased as EHR increased. No relations...
The intimacy of many historic European Opera Houses, especially of the traditional Italian style, is highly cherished and many of these halls are considered to be among the best halls acoustically. From an acoustical point of view the generally small dimensions often combined with a moderate seat count provide excellent source presence and clarity. On the other hand, the corresponding small volume leads to short reverberation times, and in recent decades higher reverberation times have been preferred and asked for by clients and audiences in many countries. Ideas will be presented on how this apparent dilemma between the preference for small dimensions (for intimacy, source presence and definition) and increased volume (in order to create longer reverberation times) can be addressed. Acoustics 08 Paris
This paper presents the current state of the BlenderCAVE project, which extends the 3D creation content software Blender and its Game Engine (BGE) to Virtual Reality (VR) applications. BlenderCAVE integrates a complete framework dedicated to Virtual Reality, compatible with the three main Operating Systems for any given VR architecture configuration. Acting as a Scene Graph, BlenderCAVE handles multi-screen/multi-user tracked stereoscopic rendering through an efficient low-level master/slave synchronization process while controlling spatial audio rendering events through OSC and VRPN protocols. BlenderCAVE allows VR users to benefit from the high level scene editing, game logic, and physics engine of Blender as well as Blender’s large user community.
This paper presents the BlenderCAVE project, which extends the 3D content creation software Blender and its Game Engine (BGE) to Virtual Reality (VR) applications. Based on a multi-screen non-stereoscopic adaptation of the BGE [Gascon et al., 2010], BlenderCAVE now integrates a complete framework dedicated to VR, compatible with the three main operating systems for any given VR architecture configuration. It has been developed by audio and VR researchers with support from the Blender community on LIMSI’s state-of-the-art VR platforms. Acting as a scene graph, BlenderCAVE handles multi-screen/multi-user tracked stereoscopic rendering through an efficient low-level master/slave synchronization process while controlling spatial audio rendering (ambisonic, multi-user binaural, WFS, etc.) and haptic events through the OSC and VRPN protocols. The scene creation process itself is reduced to simple Blender manipulations including basic Python programming easily carried out usin...
Prior to Sabine’s work on the Fogg Art Museum and Boston Symphony Hall, several numerical guidelines had been developed and applied to the design of rooms with specific acoustic demands such as theatres, concert halls, and opera houses. Previous papers have discussed guidelines based on the following principles: voice directivity, which was employed in the design of at least 11 rooms; “echo theory”, which quantifies the perception threshold between direct sound and first order reflections in order to prevent echoes from occurring, aiding in the design of at least 7 rooms and leading to the first known use of an acoustic scale model; and notions of reverberation, which influenced the design of at least 14 rooms. This paper discusses three additional pre-Sabine numerical guidelines that were used in room acoustic design: (1) audience rake, (2) stage acoustics and proscenium design, and (3) length, width, and height ratios. The origin of these theories, as well as examples of rooms in ...

In the absence of a well suited measure for quantifying binaural data variations, this study presents the use of a global perceptual distance metric which can describe both HRTF as well as listener similarities. The metric is derived based on subjective evaluations of binaural renderings of a sound moving along predefined trajectories in the horizontal and median planes. Its characteristics and advantages in describing data distributions based on perceptually relevant attributes are discussed. In addition, the use of 24 HRTFs from two different databases of origin allows for an evaluation of the perceptual impact of some database-dependent characteristics on spatialization. The effectiveness of the experimental design as well as the correlation between the HRTF evaluations of the two plane trajectories are also discussed.
This paper investigates the repeatability of an HRTF evaluation protocol that assesses the spatial quality of binaural stimuli moving along pre-defined trajectories on the horizontal and median planes, using a forced-choice 9-point rating scale. The protocol assessment was based on data simulations and subjective studies. Repeatability was evaluated as a function of the size and content of the HRTF corpus, the trajectories, and the resolution of the rating scale. Analysis of the data revealed that HRTF rating is a reliable yet challenging task, with low repeatability rates of ≈ 50%. Therefore, participant screening through pre-tests should be used to maximize the reliability of the responses.
In the absence of a well-suited measure for quantifying binaural data variations, this study presents a global perceptual distance metric which can describe both HRTF and listener similarities. The metric is derived based on subjective evaluations of binaural renderings of a sound moving along predefined trajectories on the horizontal and median planes. Its characteristics and advantages in describing data distributions based on perceptually relevant attributes are discussed. In addition, the use of 24 HRTFs from two different databases of origin allows for an evaluation of the perceptual impact of some database-dependent characteristics on binaural spatialization. The effectiveness of the experimental design and the correlation between the HRTF evaluations of the two plane trajectories are also discussed.
The inter-aural time difference (ITD) is a fundamental cue for human sound localization. Over the past decades, several methods have been proposed for its estimation from measured head-related impulse response (HRIR) data. Nevertheless, inter-method variations in ITD calculation have been found to exceed the known just noticeable differences (JNDs), leading to possible perceptible artifacts in virtual binaural auditory scenes when personalized HRIRs are used. In the absence of an objective means for validating ITD estimations, this paper examines which methods lead to the most perceptually relevant results. A subjective lateralization study compared objective ITDs to perceptually evaluated inter-aural pure delay offsets. Results clearly indicate the first-onset threshold detection method, using a low relative threshold of −30 dB, applied on 3 kHz low-pass filtered HRIRs, as consistently the most perceptually relevant procedure across various metrics. Several alternative threshold values and methods based on the maximum or centroid of the inter-aural cross correlation of similarly filtered HRIRs or HRIR envelopes also provided reasonable results. In contrast, phase-based methods employing the integrated relative group delay or an auditory model were not found to perform as well.
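The first-onset threshold detection described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a threshold of −30 dB relative to each ear's peak magnitude, a windowed-sinc FIR as the 3 kHz low-pass filter, and onset detection at the native sample rate without upsampling. The function names (`lowpass_fir`, `first_onset`, `estimate_itd`) and the sign convention are hypothetical choices made for illustration.

```python
import numpy as np


def lowpass_fir(cutoff_hz, fs, num_taps=101):
    """Windowed-sinc, linear-phase low-pass FIR (Hamming window), unity DC gain."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = 2 * cutoff_hz / fs * np.sinc(2 * cutoff_hz / fs * n)
    h *= np.hamming(num_taps)
    return h / h.sum()


def first_onset(hrir, threshold_db=-30.0):
    """Index of the first sample whose magnitude reaches the threshold,
    expressed in dB relative to the peak magnitude of this response."""
    mag = np.abs(hrir)
    level = mag.max() * 10 ** (threshold_db / 20)
    return int(np.argmax(mag >= level))  # argmax returns the first True


def estimate_itd(hrir_left, hrir_right, fs, cutoff_hz=3000.0, threshold_db=-30.0):
    """ITD in seconds as (left onset - right onset) / fs.

    Assumed sign convention: negative when the sound reaches the left ear first.
    The same filter is applied to both ears, so its delay cancels in the difference.
    """
    h = lowpass_fir(cutoff_hz, fs)
    left = np.convolve(hrir_left, h)
    right = np.convolve(hrir_right, h)
    return (first_onset(left, threshold_db) - first_onset(right, threshold_db)) / fs
```

With two synthetic impulse responses whose peaks are 30 samples apart at a 48 kHz sample rate, `estimate_itd` returns a magnitude of 30/48000 s. At the native rate the resolution is limited to one sample period, so a practical implementation would likely upsample the HRIRs first.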
We report the advantages and drawbacks of three protocols (adjustment, AFC-yes/no, and AFC-left/right discrimination protocols) evaluating the estimation of the interaural time differences (ITDs) for correct lateralization perception. The protocols were compared with respect to reliability of the perceived ITD and just noticeable difference (JND) estimations, robustness to participants' errors, effect of the number of tested blocks of trials, and test duration. Binaural stimuli were employed, including all spatial cues for sound sources placed at horizontal positions of 30° and 90°. All three protocols yielded comparable perceived ITD but different JNDs at 30°. At 90°, both the AFC-left/right and the adjustment protocols were more problematic than the yes/no protocol. Overall, only the yes/no protocol fulfilled the requirements of this study, i.e. a quick protocol that can be used for all angles.