Abstract: Affective Computing is a newly emerging field which has been defined as "computing that relates to, arises from, or deliberately influences emotions." Many applications in affective computing research rely on, or can greatly benefit from, information regarding affective states of ...
International Conference on Acoustics, Speech, and Signal Processing, 1997
In this work, inspired by the application of human-machine interaction and the potential use that human-computer interfaces can make of knowledge regarding the affective state of a user, we investigate the problem of sensing and recognizing typical affective experiences that arise when people communicate with computers. In particular, we address the problem of detecting "frustration" in human-computer interfaces. By first sensing human biophysiological correlates of internal affective states, we proceed to stochastically model the biological time series with hidden Markov models to obtain user-dependent recognition systems that learn affective patterns from a set of training data. Labeling criteria to classify the data are discussed, and generalization of the results to a set of unobserved data is evaluated. Significant recognition results (greater than random) are reported for 21 of 24 subjects.
Introduction: Affective computing, a new area of computing research, has been described as computing which relates to, arises from, or deliberately influences emotions [5]. Why build an affective computer? Recent evidence demonstrates that humans have an inherent ...
We explore the use of features derived from multiresolution analysis of speech and the Teager energy operator for classification of drivers' speech under stressed conditions. We apply this set of features to a database of short speech utterances to create user-dependent discriminants of four stress categories. In addition, we address the problem of choosing a suitable temporal scale for representing categorical differences in the data. This leads to two modeling approaches. In the first approach, the dynamics of the feature set within the utterance are assumed to be important for the classification task. These features are then classified using dynamic Bayesian network (DBN) models as well as a model consisting of a mixture of hidden Markov models (M-HMM). In the second approach, we define an utterance-level feature set by taking the mean value of the features across the utterance. This feature set is then modeled with a support vector machine and a multilayer perceptron classifier. We compare the performance on the sparser and full dynamic representations against a chance-level performance of 25% and obtain the best performance with the speaker-dependent mixture model (96.4% on the training set, and 61.2% on a separate testing set). We also investigate how these models perform on the speaker-independent task. Although the performance of the speaker-independent models degrades with respect to the models trained on individual speakers, the mixture model still outperforms the competing models and achieves significantly better than random recognition (80.4% on the training set, and 51.2% on a separate testing set).
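The abstract above names the Teager energy operator but does not spell out the feature pipeline. As a minimal sketch of the operator alone, assuming its standard discrete form psi[n] = x[n]^2 - x[n-1]*x[n+1], the following computes it on a synthetic tone; the function name `teager_energy` and the test signal are illustrative and not taken from the paper.

```python
import math


def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test signal: a pure tone A*cos(omega*n).
# For such a tone the operator is constant and equals (A*sin(omega))^2.
A, omega = 2.0, 0.3
tone = [A * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)
expected = (A * math.sin(omega)) ** 2
```

Because the output depends on both amplitude and frequency, the operator tracks changes in the energy needed to produce the signal, which is one common motivation for using it on speech produced under stress.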
Magnetic anchoring guidance systems (MAGS) are composed of an internal surgical instrument controlled by an external handheld magnet and do not require a dedicated surgical port. Therefore, this system may help to reduce internal and external collision of instruments associated with laparoendoscopic single-site (LESS) surgery. Herein, we describe the initial clinical experience with a magnetically anchored camera system used during laparoscopic nephrectomy and appendectomy in two human patients. Two separate cases were performed using a single-incision working port with the addition of a magnetically anchored camera that was controlled externally with a magnet. Surgery was successful in both cases. Nephrectomy was completed in 120 min with 150 ml estimated blood loss (EBL) and the patient was discharged home on postoperative day 2. Appendectomy was successfully completed in 55 min with EBL of 10 ml and the patient was discharged home the following morning. Use of a MAGS camera results in fewer instrument collisions, improves surgical working space, and provides an image comparable to that in standard laparoscopy.
In this work we develop and apply a class of hierarchical directed graphical models on the task of recognizing affective categories from prosody in both acted and natural speech. A strength of this new approach is the integration and summarization of information using both local (e.g., syllable level) and global prosodic phenomena (e.g., utterance level). In this framework speech is
Evolution of minimally invasive techniques has prompted interest in natural orifice transluminal endoscopic surgery (NOTES). Challenges for NOTES include loss of instrument rigidity, reduction in working envelopes, and collision of instrumentation. The magnetic anchoring and guidance system (MAGS) is one surgical innovation developed at our institution whereby instruments that are deployed intra-abdominally are maneuvered by the use of an external magnet. We present our initial animal experience with complete transvaginal NOTES nephrectomy using MAGS technology. Transvaginal NOTES nephrectomy was performed in two female pigs through a vaginotomy, using a 40-cm dual-lumen rigid access port inserted into the peritoneal cavity. A MAGS camera and cauterizer were deployed through the port and manipulated across the peritoneal surface by way of magnetic coupling via an external magnet. A prototype 70-cm articulating laparoscopic grasper introduced through the vaginal access port facilitated dissection after deployment of the MAGS instruments. The renal artery and vein were stapled en bloc using an extra-long articulating endovascular stapler. NOTES nephrectomies were successfully completed in both pigs without complications using MAGS instrumentation. The MAGS camera provided a conventional umbilical perspective of the kidney; the cauterizer, transvaginal grasper, and stapler preserved triangulation while avoiding instrument collisions. Operative duration for the two cases was 155 and 125 minutes, and blood loss was minimal. NOTES nephrectomy using MAGS instrumentation is feasible. We believe this approach addresses shortcomings of previously reported NOTES nephrectomies in that triangulation, instrument fidelity, and visualization are preserved while hilar ligation is performed using a conventional stapler without need for additional transabdominal trocars.
We developed a prototype magnetic tool for ureteroscopic extraction of magnetized stone particles. We compared its efficiency for retrieving magnetized calcium oxalate monohydrate stone particles with that of a conventional nitinol basket from the pelvi-collecting system of a benchtop ureteroscopic simulator. Iron oxide microparticles were successfully bound to 1 to 1.5, 1.5 to 2 and 2 to 2.5 mm human calcium oxalate monohydrate stones. Several coated fragments of each size were implanted in the collecting system of a benchtop ureteroscopic simulator. Five-minute timed stone extraction trials were performed for each fragment size using a back-loaded 8Fr magnetic tool mounted on a 0.038-inch guidewire or a conventional basket. The median number of fragments retrieved per timed trial was compared for the magnetic tool vs the basket using the Mann-Whitney U test. For 1 to 1.5 mm fragments the median number retrieved within 5 minutes was significantly higher for the prototype magnetic tool than for the nitinol basket (9.5 vs 3.5, p = 0.03). For 1.5 to 2 mm fragments the magnetic tool was more efficient but the difference in the number of fragments retrieved was not statistically significant (9.5 vs 4.5, p = 0.19). For 2 to 2.5 mm fragments there was no difference between the instruments in the number retrieved (6 per group, p = 1.0). The prototype magnetic tool improved the efficiency of retrieving stone particles rendered paramagnetic that were less than 2 mm but showed no advantage for larger fragments. This system has the potential to decrease the number of small retained fragments after ureteroscopic lithotripsy.
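The comparison above relies on the Mann-Whitney U test. The abstract reports only medians and p-values, so the sketch below uses made-up per-trial counts (`magnetic_counts`, `basket_counts`) purely to illustrate how the U statistic and an exact two-sided p-value can be computed for small samples; the function names are hypothetical and the numbers are not the study's data.

```python
from itertools import combinations


def u_statistic(a, b):
    """Count pairs (x in a, y in b) with x > y; ties contribute 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def exact_two_sided_p(a, b):
    """Exact two-sided p-value by enumerating every relabeling of the pooled sample."""
    pooled = list(a) + list(b)
    n_a = len(a)
    mean_u = n_a * len(b) / 2.0
    observed = abs(u_statistic(a, b) - mean_u)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        grp_a = [pooled[i] for i in idx]
        grp_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # A relabeling counts if it is at least as extreme as the observed split.
        if abs(u_statistic(grp_a, grp_b) - mean_u) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical fragment counts per five-minute trial (illustration only).
magnetic_counts = [9, 10, 8, 11]
basket_counts = [3, 4, 5, 2]
u = u_statistic(magnetic_counts, basket_counts)
p = exact_two_sided_p(magnetic_counts, basket_counts)
```

Exhaustive enumeration is practical only for small groups; for larger samples a library routine with a normal approximation would be the usual choice.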
Using a deliberately slow computer-game interface to induce a state of hypothesised frustration in users, we collected physiological, video and behavioural data, and developed a strategy for coupling these data with real-world events. The effectiveness of our strategy was tested in a study with thirty-six subjects, where the system was shown to reliably synchronise and gather data for affect analysis. A pattern-recognition strategy known as Hidden Markov Models was applied to each subject's physiological signals of skin conductivity and blood volume pressure in an effort to see if regimes of likely frustration could be automatically discriminated from regimes when frustration was much less likely. This pattern-recognition approach performed significantly better than random guessing at classifying the two regimes. Mouse-clicking behaviour was also synchronised to frustration-eliciting events and analysed, revealing four distinct patterns of clicking responses. We provide recommendations and guidelines for using physiology as a dependent measure for HCI experiments, especially when considering human emotions in the HCI equation.
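One common way to discriminate two regimes with hidden Markov models is to fit one model per regime and assign a new sequence to whichever model gives it the higher likelihood. The abstract does not state the model topology, features, or training procedure, so the sketch below is only an assumption-laden illustration: it uses discrete observations, hand-picked (not learned) parameters, and hypothetical names such as `log_likelihood`, `calm`, and `frustrated`.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete-output HMM."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    scale = sum(alpha)
    log_p = math.log(scale)
    alpha = [a / scale for a in alpha]
    for o in obs[1:]:
        # Propagate one step through the transition matrix, then weight by emission.
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                 for j in range(n)]
        scale = sum(alpha)
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p


# Hypothetical two-state models over a binary symbol (0 = low arousal, 1 = high arousal).
calm = dict(start=[0.5, 0.5], trans=[[0.9, 0.1], [0.1, 0.9]],
            emit=[[0.9, 0.1], [0.7, 0.3]])
frustrated = dict(start=[0.5, 0.5], trans=[[0.9, 0.1], [0.1, 0.9]],
                  emit=[[0.3, 0.7], [0.1, 0.9]])

sequence = [1, 1, 0, 1, 1, 1]  # made-up quantized signal
scores = {"calm": log_likelihood(sequence, **calm),
          "frustrated": log_likelihood(sequence, **frustrated)}
label = max(scores, key=scores.get)
```

In a user-dependent setup, each subject would get their own pair of models estimated from that subject's labeled training segments, with held-out segments scored as above.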
Papers by Raul Fernandez