Oswaldo Ludwig holds a Ph.D. in Electrical and Computer Engineering from the University of Coimbra, Portugal, and completed a postdoctoral fellowship in the Department of Computer Science at KU Leuven, Belgium. His research interests include Machine Learning with applications in Natural Language Processing, Automatic Speech Recognition and Computer Vision.
Wav2vec2 self-supervised multilingual training learns speech units common to multiple languages, leading to better generalization capacity. However, Wav2vec2 is larger than other end-to-end (E2E) ASR models such as the Conformer. The objective of this work is therefore to reduce the Wav2vec2 footprint by pruning rows from the intermediate dense layers of the encoder block, since these layers account for about two thirds of the encoder parameters. We apply Genetic Algorithms (GA) to solve the combinatorial optimization problem associated with pruning, which requires running many copies of the Wav2vec2 decoder in parallel using multiprocessing on a computer grid, so an effort was made to tune the GA for good performance with few CPUs. The experiments show a small absolute word error rate degradation of 0.21% (1.26% relative) at 40% pruning, and compare this result with those of the usual L1-norm pruning and of model restructuring by singular value decomposition.
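The combinatorial search described above can be sketched with a minimal genetic algorithm that selects which rows of a single toy dense-layer weight matrix to prune so that the layer's output changes as little as possible on calibration data. All shapes, the fitness function, and the GA hyperparameters below are illustrative assumptions, not the paper's actual Wav2vec2 setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one intermediate dense layer (assumed shapes, not Wav2vec2's).
W = rng.normal(size=(64, 32))    # 64 rows; pruning a row removes one hidden unit
x = rng.normal(size=(100, 32))   # calibration inputs
y_ref = x @ W.T                  # unpruned layer output

def fitness(mask):
    """Negative output distortion when rows with mask == 0 are removed."""
    y = x @ (W * mask[:, None]).T
    return -np.linalg.norm(y - y_ref)

def ga_prune(n_keep, pop=30, gens=40):
    n = W.shape[0]
    def random_mask():
        m = np.zeros(n)
        m[rng.choice(n, n_keep, replace=False)] = 1.0
        return m
    population = [random_mask() for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop // 2]          # elitist selection
        children = []
        while len(children) < pop - len(parents):
            i, j = rng.choice(len(parents), 2, replace=False)
            # Uniform crossover, then repair to keep exactly n_keep rows.
            child = np.where(rng.random(n) < 0.5, parents[i], parents[j])
            on, off = np.flatnonzero(child), np.flatnonzero(child == 0)
            if len(on) > n_keep:
                child[rng.choice(on, len(on) - n_keep, replace=False)] = 0.0
            elif len(on) < n_keep:
                child[rng.choice(off, n_keep - len(on), replace=False)] = 1.0
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

best = ga_prune(n_keep=int(0.6 * 64))  # prune 40% of the rows
```

In the actual work, evaluating the fitness of each chromosome means running a full decoder copy, which is why the paper parallelizes the population over a grid.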
This paper introduces a new biologically inspired training method named Continual Learning through Adjustment Suppression and Sparsity Promotion (CLASSP). CLASSP is based on two main principles observed in neuroscience, particularly in the context of synaptic transmission and Long-Term Potentiation (LTP). The first principle is a decay rate over the weight adjustments, implemented as a generalization of the AdaGrad optimization algorithm: weights that have received many updates should have lower learning rates, as they likely encode important information about previously seen data. However, this principle alone produces a diffuse distribution of updates throughout the model, since it favors updates to weights that have not been updated before, whereas a sparse update distribution is preferable to leave weights unassigned for future tasks. Therefore, the second principle introduces a threshold on the loss gradient: a weight is updated only if the loss gradient with respect to it exceeds a certain threshold, i.e. only weights with a significant impact on the current loss are updated. Both principles reflect phenomena observed in LTP, where a threshold effect and a gradual saturation of potentiation have been reported. CLASSP is implemented as a Python/PyTorch class, making it applicable to any model. When compared with Elastic Weight Consolidation (EWC) on Computer Vision datasets, CLASSP demonstrates superior performance in terms of accuracy and memory footprint.
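The two principles can be sketched with a toy NumPy optimizer rather than the actual CLASSP PyTorch class; the class name and the `lr / sqrt(accum + 1)` schedule are assumed AdaGrad-like forms for illustration, not the paper's exact update rule.

```python
import numpy as np

class ClasspLikeOptimizer:
    """Toy sketch of CLASSP's two principles (not the official implementation):
    an AdaGrad-style decay over accumulated updates plus a gradient-magnitude
    threshold that keeps the update distribution sparse."""

    def __init__(self, params, lr=0.1, threshold=0.05):
        self.params = params                              # numpy arrays, updated in place
        self.lr, self.threshold = lr, threshold
        self.accum = [np.zeros_like(p) for p in params]   # accumulated squared gradients

    def step(self, grads):
        for p, g, a in zip(self.params, grads, self.accum):
            # Principle 2: update only weights whose gradient is significant.
            active = np.abs(g) > self.threshold
            # Principle 1: frequently updated weights get lower learning rates
            # (the 1/sqrt(accum + 1) schedule is an assumed AdaGrad-like form).
            scale = self.lr / np.sqrt(a + 1.0)
            p -= np.where(active, scale * g, 0.0)
            a += np.where(active, g * g, 0.0)

# Usage on f(w) = 0.5 * ||w||^2, whose gradient is w itself:
w = np.array([1.0, 0.001])          # the second weight stays below the threshold
opt = ClasspLikeOptimizer([w])
for _ in range(50):
    opt.step([w.copy()])
```

After training, the first weight has decayed with progressively smaller steps, while the sub-threshold weight is never touched and remains available for future tasks.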
This work presents a method for maximizing the mutual information between segment representations and the generated sequence of phonemes in GAN-based unsupervised speech recognition, providing better control over the inclusion of unrelated textual information in the transcription and allowing experiments with deeper generators.
This paper presents a new adversarial learning method for generative conversational agents (GCA), along with a new GCA model. As in previous work on adversarial learning for dialogue generation, our method treats the GCA as a generator that aims to fool a discriminator that labels dialogues as human-generated or machine-generated; however, in our approach the discriminator performs token-level classification, i.e. it indicates whether the current token was generated by humans or machines. To do so, the discriminator also receives as input the context utterances (the dialogue history) and the incomplete answer up to the current token. This new approach makes end-to-end training by backpropagation possible. A self-conversation process produces a more diverse set of generated data for the adversarial training, which improves performance on questions not related to the training data. Experimental results with human and adversarial evaluations show that the adversarial method yields significant performance gains over the usual teacher forcing training.
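The token-level discriminator idea can be sketched as follows, with a toy bag-of-embeddings scorer standing in for the actual neural discriminator; the vocabulary size, embedding dimension, and pooling scheme are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 20, 8                                       # assumed toy vocabulary / embedding sizes
E = rng.normal(scale=0.1, size=(V, D))             # shared token embeddings
w_out, b_out = rng.normal(scale=0.1, size=D), 0.0  # logistic scoring head

def token_scores(context, answer):
    """Per-token P(human) for each answer token, conditioned on the dialogue
    history plus the incomplete answer up to that token. A bag-of-embeddings
    stand-in for the paper's neural discriminator."""
    scores = []
    for t in range(len(answer)):
        seen = context + answer[:t + 1]        # history + answer prefix
        h = E[seen].mean(axis=0)               # pooled representation (assumption)
        scores.append(1.0 / (1.0 + np.exp(-(h @ w_out + b_out))))
    return np.array(scores)

scores = token_scores(context=[1, 2, 3], answer=[4, 5, 6])
```

The key structural point carries over from the paper: because every answer prefix gets its own human/machine score, the generator receives a learning signal at each token rather than one label per dialogue.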
This repository presents a new adversarial learning method for generative conversational agents (GCA), along with a new GCA model. Our method treats the GCA as a generator that aims to fool a discriminator that labels dialogues as human-generated or machine-generated; however, in our approach the discriminator performs token-level classification, i.e. it indicates whether the current token was generated by humans or machines. To do so, the discriminator also receives as input the context utterances (the dialogue history) and the incomplete answer up to the current token. This new approach makes end-to-end training by backpropagation possible. A self-conversation process produces a more diverse set of generated data for the adversarial training, which improves performance on questions not related to the training data. Moreover, the adversarial training also yields a trained discriminator that can be used to select the best answer when different mod...
2008 11th International IEEE Conference on Intelligent Transportation Systems, 2008
In this paper, a multilayer feedforward neural-network-based approach for vehicle detection is proposed. The main idea is to use the network to perform both feature extraction and classification; this simplicity enables real-time applications. To achieve these capabilities, the network is trained by a new algorithm, proposed in this paper, named minimization of inter-class interference (MCI). The algorithm aims to create a hidden space (i.e. feature space) in which the patterns have a desirable statistical distribution. Regarding the neural architecture, the linear output layer is replaced by a Mahalanobis kernel in order to improve generalization. Experiments are performed on a dataset that includes two standard Caltech car-rear datasets. Finally, disturbed images are used to evaluate the robustness of the neural-network-based vehicle detection. The proposed method achieves a low miss rate, a low false alarm rate, and a high area under the ROC curve. In a Matlab environment, the algorithm spends only 3.280e-4 seconds per image. These facts encourage this research line.
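The Mahalanobis output stage can be illustrated with a minimal sketch: each class is represented by a mean and an inverse covariance estimated in the hidden (feature) space, and a pattern is assigned to the class at the smallest Mahalanobis distance. The statistics below are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def mahalanobis_classify(h, means, inv_covs):
    """Assign hidden-space feature vector h to the class with the smallest
    squared Mahalanobis distance. Class means and inverse covariances are
    assumed to be estimated beforehand from the training patterns."""
    d2 = [(h - m) @ S @ (h - m) for m, S in zip(means, inv_covs)]
    return int(np.argmin(d2))

# Hypothetical two-class statistics in a 2-D feature space.
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
inv_covs = [np.eye(2), np.eye(2)]
```

Unlike a linear output layer, this decision rule accounts for the spread of each class in the learned feature space, which is why the paper uses it to improve generalization.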
This paper extends our previous work on regularization of neural networks using Eigenvalue Decay by employing a soft approximation of the dominant eigenvalue that enables the calculation of its derivatives with respect to the synaptic weights, and therefore the application of back-propagation, a primary requirement for deep learning. Moreover, we extend our previous theoretical analysis to deep neural networks and multiclass classification problems. Our method is implemented as an additional regularizer in Keras, a modular neural networks library written in Python, and evaluated on the benchmark datasets Reuters Newswire Topics Classification, the IMDB database for binary sentiment classification, the MNIST database of handwritten digits, and the CIFAR-10 image classification dataset.
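One differentiable-in-principle way to estimate the dominant eigenvalue, shown here as a hedged NumPy sketch (the paper's exact soft approximation may differ from this surrogate), is a fixed number of power-iteration steps followed by a Rayleigh quotient:

```python
import numpy as np

def dominant_eigenvalue(W, iters=50):
    """Estimate the largest eigenvalue of W @ W.T via power iteration and a
    Rayleigh quotient. For a fixed number of iterations every operation is
    smooth in W, so the penalty admits back-propagation; this is one possible
    surrogate, not necessarily the paper's soft approximation."""
    A = W @ W.T
    v = np.ones(A.shape[0]) / np.sqrt(A.shape[0])
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)
    return v @ A @ v

def regularized_loss(W, data_loss, c=1e-2):
    # Eigenvalue Decay: penalize the dominant eigenvalue of W W^T.
    return data_loss + c * dominant_eigenvalue(W)

W_demo = np.array([[2.0, 0.0], [0.0, 1.0]])
lam = dominant_eigenvalue(W_demo)   # W W^T = diag(4, 1), so lam is close to 4
```

A hard `max` over eigenvalues would be non-smooth where eigenvalues cross; a smooth surrogate like this is what makes the regularizer usable inside a framework such as Keras.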
This position paper addresses hierarchical multi-task learning (HMTL) in the seq2seq context, i.e. a specific type of multi-task learning in which tasks are hierarchically related in terms of level of abstraction. An example of HMTL is the pipeline of end-to-end Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) tasks, in which the ASR model takes waveforms as input and generates the corresponding textual transcription, while the NLU model learns to map this textual information to a structured representation according to the context/application. Back-propagating errors through a cascade of seq2seq models is not straightforward, because the sampling operations performed during decoding are not differentiable, requiring approximations of the argmax operation. Therefore, this document proposes a training method in which each seq2seq block is trained separately, but aware of the errors of the next seq2seq block(s).
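The argmax approximation mentioned above can be illustrated with a low-temperature softmax, one common differentiable surrogate for the hard token-selection step between cascaded blocks (a sketch, not necessarily the approximation used in the paper):

```python
import numpy as np

def soft_argmax(logits, tau=0.1):
    """Low-temperature softmax as a differentiable surrogate for argmax:
    it returns a near-one-hot distribution over tokens instead of a hard,
    non-differentiable selection, so gradients can flow through it."""
    z = logits / tau
    z = z - z.max()               # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = np.array([1.0, 3.0, 0.5])   # hypothetical token logits from one block
p = soft_argmax(logits)
```

As the temperature `tau` approaches zero the output approaches the hard one-hot argmax, which is exactly the trade-off that makes such approximations delicate in a cascade and motivates the paper's alternative of training each block separately.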
Papers by Oswaldo Ludwig