HARP-Net: Hyper-Autoencoded Reconstruction Propagation for Scalable Neural Audio Coding

Petermann, Darius; Beack, Seungkwon; Kim, Minje

arXiv:2107.10843 (eess)

[Submitted on 22 Jul 2021 (v1), last revised 23 Jul 2021 (this version, v2)]

Abstract: An autoencoder-based codec employs quantization to turn its bottleneck layer activation into bitstrings, a process that hinders information flow between the encoder and decoder parts. To circumvent this issue, we employ additional skip connections between corresponding pairs of encoder-decoder layers. The assumption is that, in a mirrored autoencoder topology, a decoder layer reconstructs the intermediate feature representation of its corresponding encoder layer. Hence, any additional information directly propagated from the corresponding encoder layer helps the reconstruction. We implement these skip connections in the form of additional autoencoders, each of which is a small codec that compresses the massive data transfer between the paired encoder-decoder layers. We empirically verify that the proposed hyper-autoencoded architecture improves perceptual audio quality compared to an ordinary autoencoder baseline.
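
The data flow described in the abstract can be sketched as a toy forward pass. This is a minimal illustration, not the authors' implementation: the dense layers, random untrained weights, layer sizes, the uniform 16-level quantizer, and the choice to add the skip reconstruction to the decoder feature are all assumptions made for the example, and every name (ToyHyperAutoencoder, skip_enc, skip_dec, and so on) is hypothetical.

```python
import math
import random


def make_dense(n_in, n_out, rng):
    """Random, untrained weights: n_out rows of n_in values (assumption)."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]


def dense(weights, x):
    """Linear layer followed by tanh, so outputs stay within [-1, 1]."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def quantize(code, levels=16):
    """Uniform scalar quantizer on [-1, 1]: floats -> integer symbols."""
    step = 2.0 / (levels - 1)
    return [round((c + 1.0) / step) for c in code]


def dequantize(symbols, levels=16):
    """Map integer symbols back to their reconstruction values in [-1, 1]."""
    step = 2.0 / (levels - 1)
    return [s * step - 1.0 for s in symbols]


class ToyHyperAutoencoder:
    """Mirrored two-layer autoencoder plus one small 'skip' autoencoder."""

    def __init__(self, n_in=16, n_hidden=8, n_code=4, n_skip=2, seed=0):
        rng = random.Random(seed)
        self.enc1 = make_dense(n_in, n_hidden, rng)      # outer encoder layer
        self.enc2 = make_dense(n_hidden, n_code, rng)    # inner encoder layer
        self.dec2 = make_dense(n_code, n_hidden, rng)    # mirrors enc2
        self.dec1 = make_dense(n_hidden, n_in, rng)      # mirrors enc1
        # Small codec that carries the enc1 feature to its paired decoder layer.
        self.skip_enc = make_dense(n_hidden, n_skip, rng)
        self.skip_dec = make_dense(n_skip, n_hidden, rng)

    def encode(self, frame):
        h1 = dense(self.enc1, frame)
        main_symbols = quantize(dense(self.enc2, h1))      # main bottleneck
        skip_symbols = quantize(dense(self.skip_enc, h1))  # skip bottleneck
        return main_symbols, skip_symbols

    def decode(self, main_symbols, skip_symbols=None):
        # The decoder's own estimate of the encoder feature h1.
        g1 = dense(self.dec2, dequantize(main_symbols))
        if skip_symbols is not None:
            # Assumption: combine by element-wise addition.
            h1_hat = dense(self.skip_dec, dequantize(skip_symbols))
            g1 = [a + b for a, b in zip(g1, h1_hat)]
        return dense(self.dec1, g1)


model = ToyHyperAutoencoder()
frame = [math.sin(0.3 * n) for n in range(16)]  # stand-in for an audio frame
main_symbols, skip_symbols = model.encode(frame)
reconstruction = model.decode(main_symbols, skip_symbols)
baseline = model.decode(main_symbols)  # plain autoencoder path, no skip code
# Raw symbol cost without entropy coding: log2(16) bits per symbol.
bits_per_frame = (len(main_symbols) + len(skip_symbols)) * math.log2(16)
```

Because the skip feature passes through its own quantized bottleneck, it is transmitted as a second, smaller bitstring rather than copied across for free. Calling decode without skip_symbols reproduces the ordinary autoencoder path that the abstract uses as its baseline.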

Comments:	Accepted to the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2021, Mohonk Mountain House, New Paltz, NY
Subjects:	Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)
Cite as:	arXiv:2107.10843 [eess.AS]
	(or arXiv:2107.10843v2 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2107.10843

Submission history

From: Darius Petermann
[v1] Thu, 22 Jul 2021 17:57:53 UTC (7,178 KB)
[v2] Fri, 23 Jul 2021 14:33:04 UTC (7,178 KB)
