Models for supervised learning in sequence data
Abstract
Much of the observational data that we see around us is ordered in space or time: for instance, video data, audio data, or text data. This ordered data, called sequence data, calls for automatic analysis using supervised learning. Traditional single-observation supervised learning is challenged by sequence data because (1) the length of sequence examples is often variable; (2) sequences may contain irrelevant segments that negatively impact learning performance; and (3) there exist temporal dependencies between consecutive observations in a sequence that supervised learning on sequence data needs to exploit. This thesis introduces new models for supervised learning on sequence data that specifically address these challenges.
We first propose a sequence classification model: a graphical model that uses hidden variables to capture the latent structure in the sequence data. It advances the state of the art by modeling much more complex decision boundaries with the same number of hidden variables. Subsequently, we present a sequence classification model that is able to deal with unsegmented sequences. The proposed model integrates ideas from attention models and gated recurrent neural networks. It not only discerns the salient segments and filters out the irrelevant ones, but also measures the relevance of each time step of the sequence to the final task. In addition, we propose an end-to-end model for age estimation from facial expression videos that performs feature learning and supervised learning for the final task jointly.
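The relevance-weighting idea above can be illustrated with a minimal sketch: score each time step of a sequence of hidden states, normalize the scores with a softmax into attention weights, and pool the sequence into a single vector for classification. This is not the thesis model (which combines attention with gated recurrent networks); the scoring vector `w` and the random toy data are hypothetical stand-ins.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(h, w):
    """Pool a sequence of hidden states h (T x d) into one vector.

    Each time step gets a relevance score via the (hypothetical)
    scoring vector w; the softmax turns scores into weights, so
    salient steps dominate and irrelevant ones are suppressed.
    """
    scores = h @ w           # (T,): one relevance score per time step
    alpha = softmax(scores)  # attention weights, sum to 1
    return alpha, alpha @ h  # weighted average of the hidden states

rng = np.random.default_rng(0)
h = rng.standard_normal((5, 4))  # toy sequence: 5 steps, 4-dim states
w = rng.standard_normal(4)       # hypothetical scoring vector
alpha, pooled = attention_pool(h, w)
print(alpha.round(3), pooled.shape)
```

The weights `alpha` make the model's judgment inspectable: time steps with near-zero weight are the ones the model treats as irrelevant to the final task.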
Next, we consider supervised learning on paired sequences, where the goal is to predict whether two sequences are similar. Combining ideas from sequence modeling and metric learning, we propose Siamese Recurrent Networks to learn a good similarity measure between two sequences. Our model outperforms existing techniques that are based on handcrafted similarity measures or on unsupervised learning. Finally, we present a model that predicts the preferences of users for items in a recommender system. In this case, the two input sequences represent a pair of historic user and item data, each with its own properties. The dependencies between the two sequences are modeled using an attention scheme.
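The Siamese setup described above can be sketched in a few lines: one recurrent encoder with shared weights maps each sequence to a fixed-size vector, and a similarity function compares the two encodings. As an assumption for illustration, this sketch uses a plain tanh recurrence and cosine similarity with random untrained weights; the thesis model uses learned recurrent networks and a learned metric.

```python
import numpy as np

def rnn_encode(x, W_in, W_rec):
    """Encode a variable-length sequence x (T x d_in) with a simple
    tanh recurrence; return the final hidden state as the embedding."""
    h = np.zeros(W_rec.shape[0])
    for x_t in x:
        h = np.tanh(W_in @ x_t + W_rec @ h)
    return h

def siamese_similarity(x1, x2, W_in, W_rec):
    """Siamese principle: the SAME weights encode both sequences,
    and similarity is the cosine between the two final states."""
    h1 = rnn_encode(x1, W_in, W_rec)
    h2 = rnn_encode(x2, W_in, W_rec)
    return h1 @ h2 / (np.linalg.norm(h1) * np.linalg.norm(h2))

rng = np.random.default_rng(1)
W_in = rng.standard_normal((8, 3)) * 0.3   # hypothetical weights
W_rec = rng.standard_normal((8, 8)) * 0.3
a = rng.standard_normal((6, 3))  # the two sequences may differ
b = rng.standard_normal((4, 3))  # in length: 6 vs 4 time steps
print(round(siamese_similarity(a, a, W_in, W_rec), 3))  # → 1.0
print(siamese_similarity(a, b, W_in, W_rec))
```

Because the two branches share weights, sequences of different lengths are mapped into the same embedding space, which is what allows a single learned metric to compare them.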