Acoustic scene analysis based on hierarchical generative model of acoustic event sequence
K Imoto, S Shimauchi
IEICE TRANSACTIONS on Information and Systems, 2016
We propose a novel method for estimating acoustic scenes, such as user activities (e.g., “cooking,” “vacuuming,” “watching TV”) or situations (e.g., “being on the bus,” “being in a park,” “meeting”), from information about acoustic events. Some existing methods estimate an acoustic scene by associating it with a combination of acoustic events. However, because these methods associate acoustic events directly with acoustic scenes, they cannot adequately express a scene such as “cooking” that has more than one subordinate category, e.g., “frying ingredients” or “plating food.” In this paper, we propose an acoustic scene estimation method based on a hierarchical probabilistic generative model of an acoustic event sequence that takes into account the relations among acoustic scenes, their subordinate categories, and acoustic event sequences. In the proposed model, each acoustic scene is represented as a probability distribution over its subordinate categories, called “acoustic sub-topics,” which are learned without supervision, and each acoustic sub-topic is represented as a probability distribution over acoustic events. Acoustic scene estimation experiments with real-life sounds showed that the proposed method could correctly extract subordinate categories of acoustic scenes.
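
The two-level structure described in the abstract (scene, then sub-topic, then event) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the event names, the probability tables, and the function names are all hypothetical, the parameters are fixed by hand rather than learned, and events are treated as independent draws given the scene, which is a simplifying assumption that ignores any ordering information the paper's model may use.

```python
import math
import random

# Hypothetical toy vocabulary and parameters (not taken from the paper).
EVENTS = ["sizzle", "chop", "dish_clink", "motor_hum"]

# P(sub-topic | scene): each scene is a distribution over three sub-topics.
SCENE_TO_SUBTOPIC = {
    "cooking": [0.55, 0.40, 0.05],
    "vacuuming": [0.05, 0.05, 0.90],
}

# P(event | sub-topic): each sub-topic is a distribution over EVENTS.
SUBTOPIC_TO_EVENT = [
    [0.55, 0.35, 0.05, 0.05],  # sub-topic 0, loosely "frying ingredients"
    [0.05, 0.10, 0.80, 0.05],  # sub-topic 1, loosely "plating food"
    [0.05, 0.05, 0.10, 0.80],  # sub-topic 2, dominated by motor noise
]


def sample_sequence(scene, length, rng):
    """Generate events: draw a sub-topic from the scene, then an event from it."""
    theta = SCENE_TO_SUBTOPIC[scene]
    sequence = []
    for _ in range(length):
        z = rng.choices(range(len(theta)), weights=theta)[0]
        e = rng.choices(range(len(EVENTS)), weights=SUBTOPIC_TO_EVENT[z])[0]
        sequence.append(EVENTS[e])
    return sequence


def event_prob(scene, event):
    """P(event | scene), marginalizing over the hidden sub-topic."""
    idx = EVENTS.index(event)
    theta = SCENE_TO_SUBTOPIC[scene]
    return sum(t * phi[idx] for t, phi in zip(theta, SUBTOPIC_TO_EVENT))


def log_likelihood(scene, sequence):
    """Log-likelihood of a sequence under the independence assumption."""
    return sum(math.log(event_prob(scene, e)) for e in sequence)


def estimate_scene(sequence):
    """Pick the scene with the highest log-likelihood for the sequence."""
    return max(SCENE_TO_SUBTOPIC, key=lambda s: log_likelihood(s, sequence))
```

For example, `estimate_scene(sample_sequence("cooking", 20, random.Random(0)))` scores a sampled sequence against both toy scenes. In the setting the abstract describes, the sub-topic distributions would be estimated from data without sub-topic labels rather than written down by hand as they are here.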