Enrico Marchetto | Università degli Studi di Padova - Academia.edu

Enrico Marchetto

Università degli Studi di Padova, Dipartimento di Ingegneria dell'Informazione (DEI), Post-Doc

Followers: 88 | Following: 16 | Co-authors: 3

Uploads

Papers

A set of audio features for the morphological description of vocal imitations

In our current project, a vocal signal is used to drive sound synthesis. To study the mapping between voice and synthesis parameters, the inverse problem is studied first. A set of reference synthesizer sounds was created, and each sound was imitated by a large number of people. Each reference synthesizer sound belongs to one of the following six morphological categories: "up", "down", "up/down", "impulse", "repetition", "stable". The goal of this paper is to study the automatic estimation of these morphological categories from the vocal imitations. We propose three approaches. A baseline system is introduced first. It uses standard audio descriptors as inputs to a continuous Hidden Markov Model (HMM) and achieves an accuracy of 55.1%. To improve on this, we propose a set of slope descriptors which, converted into symbols, are used as input to a discrete HMM. This system reaches 70.8% accuracy. Recognition performance is further increased by developing specific compact audio descriptors that directly highlight the morphological aspects of sounds instead of relying on an HMM. This system reaches the highest accuracy: 83.6%.
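As a rough illustration of what "slope descriptors converted into symbols" could look like, the sketch below quantizes the local slope of a contour (for example, a pitch or loudness track) into three symbols. This is only an assumption-laden toy: the function name `slope_symbols`, the window length, the threshold, and the three-symbol alphabet are all hypothetical and are not taken from the paper, whose actual descriptors are not given in the abstract.

```python
def slope_symbols(contour, window=3, threshold=0.5):
    """Quantize local slopes of a contour into 'U' (up), 'D' (down), 'S' (stable).

    Hypothetical helper: window size, threshold, and alphabet are
    illustrative assumptions, not the descriptors used in the paper.
    """
    if window < 2:
        raise ValueError("window must contain at least two frames")
    symbols = []
    # Walk over non-overlapping windows; a trailing partial window is ignored.
    for start in range(0, len(contour) - window + 1, window):
        segment = contour[start:start + window]
        # Average per-frame slope: (last - first) divided by the number of steps.
        slope = (segment[-1] - segment[0]) / (window - 1)
        if slope > threshold:
            symbols.append("U")
        elif slope < -threshold:
            symbols.append("D")
        else:
            symbols.append("S")
    return symbols


# A contour that rises, then falls, then flattens out.
rising_then_falling = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 0]
print(slope_symbols(rising_then_falling))  # ['U', 'U', 'D', 'S']
```

A symbol sequence of this kind is the sort of discrete observation stream a discrete HMM can consume, with one model trained per morphological category.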

Automatic Speaker Recognition and Characterization by means of Robust Vocal Source Features

Modellazione fisica della glottide e inversione acustico-articolatoria [Physical modeling of the glottis and acoustic-to-articulatory inversion]

This work presents a technique for estimating the two-mass vocal fold model from a given time-varying glottal flow. The two-mass model is specified by a number of low-level mechanical parameters, computed as functions of four articulatory parameters (the activation levels of three laryngeal muscles and the subglottal pressure). The glottal flow waveforms synthesized by the model are characterized by a set of acoustic parameters for quantifying the voice source. ...

Estimation of a Physical Model of the Vocal Folds Via Dynamic Programming Techniques

Abstract: This work presents a procedure for the estimation of a two-mass vocal fold model starting from a time-varying target flow signal. The model is specified by a large number of physical parameters, computed as functions of four articulatory parameters (three laryngeal muscle activations and subglottal pressure). Flow waveforms synthesized by the model are characterized by means of a set of typical acoustic parameters for voice source quantification. Given a sequence of target acoustic parameters, dynamic programming techniques and ...

An automatic speaker recognition system for intelligence applications

by Enrico Marchetto and Federico Avanzini

Proceedings of European …, Jan 1, 2009

Sistema per il controllo della Voice Quality nella sintesi del parlato emotivo [A system for controlling voice quality in emotional speech synthesis]

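Inversione di un modello fisico dell'apparato fonatorio mediante programmazione dinamica e reti RBF. [Inversion of a physical model of the phonatory apparatus using dynamic programming and RBF networks.]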

Automatic speaker recognition by means of robust vocal source features

Control of voice quality for emotional speech synthesis

Proceedings of …, Jan 1, 2004

An automatic speaker recognition system for intelligence applications

by Enrico Marchetto and F. Flego

Proceedings of European …, Jan 1, 2009

A spectral subtraction rule for real-time DSP implementation of noise reduction in speech signals

by Enrico Marchetto and Matteo Romanin
