Electrical Engineering and Systems Science > Audio and Speech Processing
[Submitted on 19 Jun 2024 (v1), last revised 8 Jul 2024 (this version, v2)]
Title: Pushing the Limit of Sound Event Detection with Multi-Dilated Frequency Dynamic Convolution
Abstract: Frequency dynamic convolution (FDY conv) has been a milestone in the sound event detection (SED) field, but it substantially increases model size because it uses multiple basis kernels. In this work, we propose partial frequency dynamic convolution (PFD conv), which concatenates the output of a static convolution with the output of a dynamic FDY conv to limit the increase in model size while maintaining performance. Additionally, we propose multi-dilated frequency dynamic convolution (MDFD conv), which integrates multiple dilated frequency dynamic convolution (DFD conv) branches with different dilation size sets and a static branch within a single convolution module, achieving a 3.17% improvement in polyphonic sound detection score (PSDS) over FDY conv. The proposed methods, together with extensive ablation studies, further improve the understanding and usability of FDY conv variants.
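The model-size argument in the abstract can be illustrated with a rough weight count. The sketch below is not taken from the paper: the function names, the `dynamic_ratio` parameter, and the example layer sizes are hypothetical, and it ignores biases and the attention branch that selects among basis kernels. It only assumes that FDY conv stores several full basis kernels, and that PFD conv splits the output channels between a static branch and a dynamic branch whose outputs are concatenated.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard 2D convolution with a k x k kernel (bias ignored)."""
    return c_in * c_out * k * k


def fdy_conv_params(c_in, c_out, k, n_basis):
    """Weight count when n_basis full basis kernels are stored.

    Assumption: the attention branch that weights the basis kernels is ignored.
    """
    return n_basis * conv_params(c_in, c_out, k)


def pfd_conv_params(c_in, c_out, k, n_basis, dynamic_ratio):
    """Weight count when only a fraction of the output channels is dynamic.

    dynamic_ratio is a hypothetical knob: the share of output channels produced
    by the dynamic branch. The remaining channels come from a static convolution,
    and the two outputs are concatenated along the channel axis.
    """
    c_dyn = int(round(c_out * dynamic_ratio))
    c_static = c_out - c_dyn
    return conv_params(c_in, c_static, k) + fdy_conv_params(c_in, c_dyn, k, n_basis)


if __name__ == "__main__":
    # Illustrative layer sizes only; not the configuration used in the paper.
    c_in, c_out, k, n_basis = 64, 64, 3, 4
    print("static conv :", conv_params(c_in, c_out, k))
    print("FDY conv    :", fdy_conv_params(c_in, c_out, k, n_basis))
    print("PFD conv 25%:", pfd_conv_params(c_in, c_out, k, n_basis, 0.25))
```

Under these assumptions, making only a quarter of the output channels dynamic keeps the layer well below the full FDY conv weight count, which is the trade-off the abstract describes.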
Submission history
From: Hyeonuk Nam
[v1] Wed, 19 Jun 2024 08:02:02 UTC (249 KB)
[v2] Mon, 8 Jul 2024 02:32:58 UTC (253 KB)