
Showing 1–10 of 10 results for author: Panagakis, Y

Searching in archive eess.
  1. arXiv:2311.17968  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    Latent Alignment with Deep Set EEG Decoders

    Authors: Stylianos Bakas, Siegfried Ludwig, Dimitrios A. Adamos, Nikolaos Laskaris, Yannis Panagakis, Stefanos Zafeiriou

    Abstract: The variability in EEG signals between different individuals poses a significant challenge when implementing brain-computer interfaces (BCI). Commonly proposed solutions to this problem include deep learning models, due to their increased capacity and generalization, as well as explicit domain adaptation techniques. Here, we introduce the Latent Alignment method that won the Benchmarks for EEG Tra…

    Submitted 29 November, 2023; originally announced November 2023.

    ACM Class: I.2.6

  2. arXiv:2309.11140  [pdf, other]

    cs.SD cs.LG eess.AS

    Investigating Personalization Methods in Text to Music Generation

    Authors: Manos Plitsis, Theodoros Kouzelis, Georgios Paraskevopoulos, Vassilis Katsouros, Yannis Panagakis

    Abstract: In this work, we investigate the personalization of text-to-music diffusion models in a few-shot setting. Motivated by recent advances in the computer vision domain, we are the first to explore the combination of pre-trained text-to-audio diffusers with two established personalization methods. We experiment with the effect of audio-specific data augmentation on the overall system performance and a…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024, Examples at https://zelaki.github.io/

  3. arXiv:2307.16584  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS

    Audio-visual video-to-speech synthesis with synthesized input audio

    Authors: Triantafyllos Kefalas, Yannis Panagakis, Maja Pantic

    Abstract: Video-to-speech synthesis involves reconstructing the speech signal of a speaker from a silent video. The implicit assumption of this task is that the sound signal is either missing or contains a high amount of noise/corruption such that it is not useful for processing. Previous works in the literature either use video inputs only or employ both video and audio inputs during training, and discard…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  4. arXiv:2306.15464  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS

    Large-scale unsupervised audio pre-training for video-to-speech synthesis

    Authors: Triantafyllos Kefalas, Yannis Panagakis, Maja Pantic

    Abstract: Video-to-speech synthesis is the task of reconstructing the speech signal from a silent video of a speaker. Most established approaches to date involve a two-step process, whereby an intermediate representation from the video, such as a spectrogram, is extracted first and then passed to a vocoder to produce the raw audio. Some recent work has focused on end-to-end synthesis, whereby the generation…

    Submitted 31 July, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: Corrected typos. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  5. arXiv:2303.05582  [pdf, other]

    cs.LG cs.IR cs.IT eess.SP

    Generalization analysis of an unfolding network for analysis-based Compressed Sensing

    Authors: Vicky Kouni, Yannis Panagakis

    Abstract: Unfolding networks have shown promising results in the Compressed Sensing (CS) field. Yet, the investigation of their generalization ability is still in its infancy. In this paper, we perform generalization analysis of a state-of-the-art ADMM-based unfolding network, which jointly learns a decoder for CS and a sparsifying redundant analysis operator. To this end, we first impose a structural const…

    Submitted 9 March, 2023; originally announced March 2023.

  6. arXiv:2208.02089  [pdf, other]

    cs.CV cs.LG eess.IV

    Unsupervised Discovery of Semantic Concepts in Satellite Imagery with Style-based Wavelet-driven Generative Models

    Authors: Nikos Kostagiolas, Mihalis A. Nicolaou, Yannis Panagakis

    Abstract: In recent years, considerable advancements have been made in the area of Generative Adversarial Networks (GANs), particularly with the advent of style-based architectures that address many key shortcomings - both in terms of modeling capabilities and network interpretability. Despite these improvements, the adoption of such approaches in the domain of satellite imagery is not straightforward. Typi…

    Submitted 3 August, 2022; originally announced August 2022.

    Comments: 11 pages, 5 figures, accepted at SETN 2022

  7. arXiv:2202.12950  [pdf, other]

    eess.SP cs.AI cs.LG

    2021 BEETL Competition: Advancing Transfer Learning for Subject Independence & Heterogenous EEG Data Sets

    Authors: Xiaoxi Wei, A. Aldo Faisal, Moritz Grosse-Wentrup, Alexandre Gramfort, Sylvain Chevallier, Vinay Jayaram, Camille Jeunet, Stylianos Bakas, Siegfried Ludwig, Konstantinos Barmpas, Mehdi Bahri, Yannis Panagakis, Nikolaos Laskaris, Dimitrios A. Adamos, Stefanos Zafeiriou, William C. Duong, Stephen M. Gordon, Vernon J. Lawhern, Maciej Śliwowski, Vincent Rouanne, Piotr Tempczyk

    Abstract: Transfer learning and meta-learning offer some of the most promising avenues to unlock the scalability of healthcare and consumer technologies driven by biosignal data. This is because current methods cannot generalise well across human subjects' data and handle learning from different heterogeneously collected data sets, thus limiting the scale of training data. On the other side, developments in…

    Submitted 14 February, 2022; originally announced February 2022.

    Comments: PrePrint of the NeurIPS2021 BEETL Competition Submitted to Proceedings of Machine Learning Research (PMLR)

  8. arXiv:2202.03267  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    Team Cogitat at NeurIPS 2021: Benchmarks for EEG Transfer Learning Competition

    Authors: Stylianos Bakas, Siegfried Ludwig, Konstantinos Barmpas, Mehdi Bahri, Yannis Panagakis, Nikolaos Laskaris, Dimitrios A. Adamos, Stefanos Zafeiriou

    Abstract: Building subject-independent deep learning models for EEG decoding faces the challenge of strong covariate-shift across different datasets, subjects and recording sessions. Our approach to address this difficulty is to explicitly align feature distributions at various layers of the deep learning model, using both simple statistical techniques as well as trainable methods with more representational…

    Submitted 1 February, 2022; originally announced February 2022.

    ACM Class: I.2.6

  9. arXiv:1912.05833  [pdf, other]

    cs.LG eess.AS stat.ML

    Speech-driven facial animation using polynomial fusion of features

    Authors: Triantafyllos Kefalas, Konstantinos Vougioukas, Yannis Panagakis, Stavros Petridis, Jean Kossaifi, Maja Pantic

    Abstract: Speech-driven facial animation involves using a speech signal to generate realistic videos of talking faces. Recent deep learning approaches to facial synthesis rely on extracting low-dimensional representations and concatenating them, followed by a decoding step of the concatenated vector. This accounts for only first-order interactions of the features and ignores higher-order interactions. In th…

    Submitted 19 February, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

  10. arXiv:1906.06196  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    Factorized Higher-Order CNNs with an Application to Spatio-Temporal Emotion Estimation

    Authors: Jean Kossaifi, Antoine Toisoul, Adrian Bulat, Yannis Panagakis, Timothy Hospedales, Maja Pantic

    Abstract: Training deep neural networks with spatio-temporal (i.e., 3D) or multidimensional convolutions of higher-order is computationally challenging due to millions of unknown parameters across dozens of layers. To alleviate this, one approach is to apply low-rank tensor decompositions to convolution kernels in order to compress the network and reduce its number of parameters. Alternatively, new convolut…

    Submitted 31 March, 2020; v1 submitted 14 June, 2019; originally announced June 2019.

    Comments: IEEE CVPR 2020