
Showing 1–7 of 7 results for author: Caillon, A

  1. arXiv:2301.12662  [pdf, other]

    cs.SD cs.AI cs.LG cs.MM eess.AS

    SingSong: Generating musical accompaniments from singing

    Authors: Chris Donahue, Antoine Caillon, Adam Roberts, Ethan Manilow, Philippe Esling, Andrea Agostinelli, Mauro Verzetti, Ian Simon, Olivier Pietquin, Neil Zeghidour, Jesse Engel

    Abstract: We present SingSong, a system that generates instrumental music to accompany input vocals, potentially offering musicians and non-musicians alike an intuitive new way to create music featuring their own voice. To accomplish this, we build on recent developments in musical source separation and audio generation. Specifically, we apply a state-of-the-art source separation algorithm to a large corpus…

    Submitted 29 January, 2023; originally announced January 2023.

  2. arXiv:2301.11325  [pdf, other]

    cs.SD cs.LG eess.AS

    MusicLM: Generating Music From Text

    Authors: Andrea Agostinelli, Timo I. Denk, Zalán Borsos, Jesse Engel, Mauro Verzetti, Antoine Caillon, Qingqing Huang, Aren Jansen, Adam Roberts, Marco Tagliasacchi, Matt Sharifi, Neil Zeghidour, Christian Frank

    Abstract: We introduce MusicLM, a model generating high-fidelity music from text descriptions such as "a calming violin melody backed by a distorted guitar riff". MusicLM casts the process of conditional music generation as a hierarchical sequence-to-sequence modeling task, and it generates music at 24 kHz that remains consistent over several minutes. Our experiments show that MusicLM outperforms previous s…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: Supplementary material at https://google-research.github.io/seanet/musiclm/examples and https://kaggle.com/datasets/googleai/musiccaps

  3. arXiv:2204.07064  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Streamable Neural Audio Synthesis With Non-Causal Convolutions

    Authors: Antoine Caillon, Philippe Esling

    Abstract: Deep learning models are mostly used in an offline inference fashion. However, this strongly limits the use of these models inside audio generation setups, as most creative workflows are based on real-time digital signal processing. Although approaches based on recurrent networks can be naturally adapted to this buffer-based computation, the use of convolutions still poses some serious challenges.…

    Submitted 14 April, 2022; originally announced April 2022.

  4. arXiv:2111.05011  [pdf, other]

    cs.LG cs.SD eess.AS

    RAVE: A variational autoencoder for fast and high-quality neural audio synthesis

    Authors: Antoine Caillon, Philippe Esling

    Abstract: Deep generative models applied to audio have improved by a large margin the state-of-the-art in many speech and music related tasks. However, as raw waveform modelling remains an inherently difficult task, audio generative models are either computationally intensive, rely on low sampling rates, are complicated to control or restrict the nature of possible signals. Among those models, Variational A…

    Submitted 15 December, 2021; v1 submitted 9 November, 2021; originally announced November 2021.

  5. arXiv:2008.01370  [pdf]

    cs.SD cs.LG eess.AS

    Timbre latent space: exploration and creative aspects

    Authors: Antoine Caillon, Adrien Bitton, Brice Gatinet, Philippe Esling

    Abstract: Recent studies show the ability of unsupervised models to learn invertible audio representations using Auto-Encoders. They enable high-quality sound synthesis but a limited control since the latent spaces do not disentangle timbre properties. The emergence of disentangled representations was studied in Variational Auto-Encoders (VAEs), and has been applied to audio. Using an additional perceptual…

    Submitted 17 August, 2020; v1 submitted 4 August, 2020; originally announced August 2020.

  6. arXiv:2007.16170  [pdf, other]

    cs.LG cs.MM cs.SD eess.AS stat.ML

    Diet deep generative audio models with structured lottery

    Authors: Philippe Esling, Ninon Devis, Adrien Bitton, Antoine Caillon, Axel Chemla--Romeu-Santos, Constance Douwes

    Abstract: Deep learning models have provided extremely successful solutions in most audio application fields. However, the high accuracy of these models comes at the expense of a tremendous computation cost. This aspect is almost always overlooked in evaluating the quality of proposed models. However, models should not be evaluated without taking into account their complexity. This aspect is especially crit…

    Submitted 31 July, 2020; originally announced July 2020.

    Comments: 8 pages, 5 figures. Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8-12, 2020

  7. arXiv:1904.06215  [pdf, other]

    cs.SD cs.LG eess.AS

    Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders

    Authors: Adrien Bitton, Philippe Esling, Antoine Caillon, Martin Fouilleul

    Abstract: Generative models have thrived in computer vision, enabling unprecedented image processes. Yet the results in audio remain less advanced. Our project targets real-time sound synthesis from a reduced set of high-level parameters, including semantic controls that can be adapted to different sound libraries and specific tags. These generative variables should allow expressive modulations of target mu…

    Submitted 22 June, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

    Comments: This article has been accepted for presentation at the 22nd International Conference on Digital Audio Effects (DAFx 2019); additional content is provided in the companion repository https://github.com/acids-ircam/Expressive_WAE_FADER