
Showing 1–50 of 68 results for author: Mori, G

  1. arXiv:2405.13956  [pdf, other]

    cs.LG

    Attention as an RNN

    Authors: Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Mohamed Osama Ahmed, Yoshua Bengio, Greg Mori

    Abstract: The advent of Transformers marked a significant breakthrough in sequence modelling, providing a highly performant architecture capable of leveraging GPU parallelism. However, Transformers are computationally expensive at inference time, limiting their applications, particularly in low-resource settings (e.g., mobile and embedded devices). Addressing this, we (1) begin by showing that attention can…

    Submitted 28 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  2. arXiv:2402.10392  [pdf, other]

    cs.LG cs.AI

    Pretext Training Algorithms for Event Sequence Data

    Authors: Yimu Wang, He Zhao, Ruizhi Deng, Frederick Tung, Greg Mori

    Abstract: Pretext training followed by task-specific fine-tuning has been a successful approach in vision and language domains. This paper proposes a self-supervised pretext training framework tailored to event sequence data. We introduce a novel alignment verification task that is specialized to event sequences, building on good practices in masked reconstruction and contrastive learning. Our pretext tasks…

    Submitted 15 February, 2024; originally announced February 2024.

  3. arXiv:2402.01955  [pdf, other]

    cs.LG cs.AI math.FA

    OPSurv: Orthogonal Polynomials Quadrature Algorithm for Survival Analysis

    Authors: Lilian W. Bialokozowicz, Hoang M. Le, Tristan Sylvain, Peter A. I. Forsyth, Vineel Nagisetty, Greg Mori

    Abstract: This paper introduces the Orthogonal Polynomials Quadrature Algorithm for Survival Analysis (OPSurv), a new method providing time-continuous functional outputs for both single and competing risks scenarios in survival analysis. OPSurv utilizes the initial zero condition of the Cumulative Incidence function and a unique decomposition of probability densities using orthogonal polynomials, allowing i…

    Submitted 2 February, 2024; originally announced February 2024.

    MSC Class: 68W25 (Primary); 65Z05 (Secondary) ACM Class: I.2.0; J.3

  4. arXiv:2210.11566  [pdf, other]

    cs.CV cs.LG

    Rethinking Learning Approaches for Long-Term Action Anticipation

    Authors: Megha Nawhal, Akash Abdu Jyothi, Greg Mori

    Abstract: Action anticipation involves predicting future actions having observed the initial portion of a video. Typically, the observed video is processed as a whole to obtain a video-level representation of the ongoing activity in the video, which is then used for future prediction. We introduce ANTICIPATR which performs long-term action anticipation leveraging segment-level representations learned using…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted at ECCV'22. Project page: http://meghanawhal.github.io/projects/anticipatr.html

  5. arXiv:2209.00173  [pdf, other]

    cs.LG

    Continuous-time Particle Filtering for Latent Stochastic Differential Equations

    Authors: Ruizhi Deng, Greg Mori, Andreas M. Lehrmann

    Abstract: Particle filtering is a standard Monte-Carlo approach for a wide range of sequential inference tasks. The key component of a particle filter is a set of particles with importance weights that serve as a proxy of the true posterior distribution of some stochastic process. In this work, we propose continuous latent particle filters, an approach that extends particle filtering to the continuous-time…

    Submitted 31 August, 2022; originally announced September 2022.

  6. arXiv:2205.15236  [pdf, other]

    cs.LG cs.AI cs.CV

    RankSim: Ranking Similarity Regularization for Deep Imbalanced Regression

    Authors: Yu Gong, Greg Mori, Frederick Tung

    Abstract: Data imbalance, in which a plurality of the data samples come from a small proportion of labels, poses a challenge in training deep neural networks. Unlike classification, in regression the labels are continuous, potentially boundless, and form a natural ordering. These distinct features of regression call for new techniques that leverage the additional information encoded in label-space relations…

    Submitted 24 June, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: Accepted to ICML 2022

  7. arXiv:2205.08247  [pdf, other]

    cs.LG cs.AI

    Monotonicity Regularization: Improved Penalties and Novel Applications to Disentangled Representation Learning and Robust Classification

    Authors: Joao Monteiro, Mohamed Osama Ahmed, Hossein Hajimirsadeghi, Greg Mori

    Abstract: We study settings where gradient penalties are used alongside risk minimization with the goal of obtaining predictors satisfying different notions of monotonicity. Specifically, we present two sets of contributions. In the first part of the paper, we show that different choices of penalties define the regions of the input space where the property is observed. As such, previous methods result in mo…

    Submitted 17 May, 2022; originally announced May 2022.

    Comments: Accepted to UAI 2022

  8. arXiv:2202.00368  [pdf, other]

    cs.CV cs.LG

    Filtered-CoPhy: Unsupervised Learning of Counterfactual Physics in Pixel Space

    Authors: Steeven Janny, Fabien Baradel, Natalia Neverova, Madiha Nadri, Greg Mori, Christian Wolf

    Abstract: Learning causal relationships in high-dimensional data (images, videos) is a hard task, as they are often defined on low dimensional manifolds and must be extracted from complex signals dominated by appearance, lighting, textures and also spurious correlations in the data. We present a method for learning counterfactual reasoning of physical processes in pixel space, which requires the prediction…

    Submitted 1 February, 2022; originally announced February 2022.

    Journal ref: International Conference on Learning Representations (2022)

  9. arXiv:2110.12606  [pdf, other]

    cs.CV

    MUSE: Feature Self-Distillation with Mutual Information and Self-Information

    Authors: Yu Gong, Ye Yu, Gaurav Mittal, Greg Mori, Mei Chen

    Abstract: We present a novel information-theoretic approach to introduce dependency among features of a deep convolutional neural network (CNN). The core idea of our proposed method, called MUSE, is to combine MUtual information and SElf-information to jointly improve the expressivity of all features extracted from different layers in a CNN. We present two variants of the realization of MUSE -- Additive Inf…

    Submitted 24 October, 2021; originally announced October 2021.

    Comments: The 32nd British Machine Vision Conference (BMVC 2021)

  10. arXiv:2108.08420  [pdf, other]

    cs.CV

    D3D-HOI: Dynamic 3D Human-Object Interactions from Videos

    Authors: Xiang Xu, Hanbyul Joo, Greg Mori, Manolis Savva

    Abstract: We introduce D3D-HOI: a dataset of monocular videos with ground truth annotations of 3D object pose, shape and part motion during human-object interactions. Our dataset consists of several common articulated objects captured from diverse real-world scenes and camera viewpoints. Each manipulated object (e.g., microwave oven) is represented with a matching 3D parametric model. This data allows us to…

    Submitted 18 August, 2021; originally announced August 2021.

  11. arXiv:2106.15580  [pdf, other]

    cs.LG stat.ML

    Continuous Latent Process Flows

    Authors: Ruizhi Deng, Marcus A. Brubaker, Greg Mori, Andreas M. Lehrmann

    Abstract: Partial observations of continuous time-series dynamics at arbitrary time stamps exist in many disciplines. Fitting this type of data using statistical models with continuous dynamics is not only promising at an intuitive level but also has practical benefits, including the ability to generate continuous trajectories and to perform inference on previously unseen time stamps. Despite exciting progr…

    Submitted 27 October, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

    Comments: Accepted to NeurIPS 2021

  12. arXiv:2106.10656  [pdf, other]

    cs.LG cs.SI stat.ML

    TD-GEN: Graph Generation With Tree Decomposition

    Authors: Hamed Shirzad, Hossein Hajimirsadeghi, Amir H. Abdi, Greg Mori

    Abstract: We propose TD-GEN, a graph generation framework based on tree decomposition, and introduce a reduced upper bound on the maximum number of decisions needed for graph generation. The framework includes a permutation invariant tree generation model which forms the backbone of graph generation. Tree nodes are supernodes, each representing a cluster of nodes in the graph. Graph nodes and edges are incr…

    Submitted 23 February, 2022; v1 submitted 20 June, 2021; originally announced June 2021.

  13. arXiv:2104.11939  [pdf, other]

    cs.CV

    Piggyback GAN: Efficient Lifelong Learning for Image Conditioned Generation

    Authors: Mengyao Zhai, Lei Chen, Jiawei He, Megha Nawhal, Frederick Tung, Greg Mori

    Abstract: Humans accumulate knowledge in a lifelong fashion. Modern deep neural networks, on the other hand, are susceptible to catastrophic forgetting: when adapted to perform new tasks, they often fail to preserve their performance on previously learned tasks. Given a sequence of tasks, a naive approach addressing catastrophic forgetting is to train a separate standalone model for each task, which scales…

    Submitted 24 April, 2021; originally announced April 2021.

    Comments: Accepted to ECCV 2020

  14. arXiv:2104.11931  [pdf, other]

    cs.CV

    Adaptive Appearance Rendering

    Authors: Mengyao Zhai, Ruizhi Deng, Jiacheng Chen, Lei Chen, Zhiwei Deng, Greg Mori

    Abstract: We propose an approach to generate images of people given a desired appearance and pose. Disentangled representations of pose and appearance are necessary to handle the compound variability in the resulting generated images. Hence, we develop an approach based on intermediate representations of poses and appearance: our pose-guided appearance rendering network firstly encodes the targets' poses us…

    Submitted 24 April, 2021; originally announced April 2021.

    Comments: Accepted to BMVC 2018. arXiv admin note: substantial text overlap with arXiv:1712.01955

  15. arXiv:2103.09458  [pdf, other]

    cs.CV

    Learning Discriminative Prototypes with Dynamic Time Warping

    Authors: Xiaobin Chang, Frederick Tung, Greg Mori

    Abstract: Dynamic Time Warping (DTW) is widely used for temporal data processing. However, existing methods can neither learn the discriminative prototypes of different classes nor exploit such prototypes for further analysis. We propose Discriminative Prototype DTW (DP-DTW), a novel method to learn class-specific discriminative prototypes for temporal recognition tasks. DP-DTW shows superior performance co…

    Submitted 17 March, 2021; originally announced March 2021.

    Comments: CVPR'21 preview, 10 pages, 8 figures

  16. arXiv:2102.12679  [pdf, other]

    cs.LG stat.ML

    Variational Selective Autoencoder: Learning from Partially-Observed Heterogeneous Data

    Authors: Yu Gong, Hossein Hajimirsadeghi, Jiawei He, Thibaut Durand, Greg Mori

    Abstract: Learning from heterogeneous data poses challenges such as combining data from various sources and of different types. Meanwhile, heterogeneous data are often associated with missingness in real-world applications due to heterogeneity and noise of input sources. In this work, we propose the variational selective autoencoder (VSAE), a general framework to learn representations from partially-observe…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: International Conference on Artificial Intelligence and Statistics (AISTATS) 2021

  17. arXiv:2101.08540  [pdf, other]

    cs.CV cs.AI

    Activity Graph Transformer for Temporal Action Localization

    Authors: Megha Nawhal, Greg Mori

    Abstract: We introduce Activity Graph Transformer, an end-to-end learnable model for temporal action localization, that receives a video as input and directly predicts a set of action instances that appear in the video. Detecting and localizing action instances in untrimmed videos requires reasoning over multiple action instances in a video. The dominant paradigms in the literature process videos temporally…

    Submitted 28 January, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

    Comments: Project webpage: https://www.sfu.ca/~mnawhal/projects/agt.html; Code available at https://github.com/Nmegha2601/activitygraph_transformer

  18. arXiv:2012.04195  [pdf, other]

    cs.RO cs.AI cs.LG

    Neural fidelity warping for efficient robot morphology design

    Authors: Sha Hu, Zeshi Yang, Greg Mori

    Abstract: We consider the problem of optimizing a robot morphology to achieve the best performance for a target task, under computational resource limitations. The evaluation process for each morphological design involves learning a controller for the design, which can consume substantial time and computational resources. To address the challenge of expensive robot morphology evaluation, we present a contin…

    Submitted 9 December, 2020; v1 submitted 7 December, 2020; originally announced December 2020.

  19. arXiv:2007.02919  [pdf, other]

    cs.CV

    MCMI: Multi-Cycle Image Translation with Mutual Information Constraints

    Authors: Xiang Xu, Megha Nawhal, Greg Mori, Manolis Savva

    Abstract: We present a mutual information-based framework for unsupervised image-to-image translation. Our MCMI approach treats single-cycle image translation models as modules that can be used recurrently in a multi-cycle translation setting where the translation process is bounded by mutual information constraints between the input and output images. The proposed mutual information constraints can improve…

    Submitted 6 July, 2020; originally announced July 2020.

  20. arXiv:2003.06988  [pdf, other]

    cs.CV

    House-GAN: Relational Generative Adversarial Networks for Graph-constrained House Layout Generation

    Authors: Nelson Nauata, Kai-Hung Chang, Chin-Yi Cheng, Greg Mori, Yasutaka Furukawa

    Abstract: This paper proposes a novel graph-constrained generative adversarial network, whose generator and discriminator are built upon relational architecture. The main idea is to encode the constraint into the graph structure of its relational networks. We have demonstrated the proposed architecture for a new house layout generation problem, whose task is to take an architectural constraint as a graph (i…

    Submitted 15 March, 2020; originally announced March 2020.

  21. arXiv:2002.10516  [pdf, other]

    cs.LG stat.ML

    Modeling Continuous Stochastic Processes with Dynamic Normalizing Flows

    Authors: Ruizhi Deng, Bo Chang, Marcus A. Brubaker, Greg Mori, Andreas Lehrmann

    Abstract: Normalizing flows transform a simple base distribution into a complex target distribution and have proved to be powerful models for data generation and density estimation. In this work, we propose a novel type of normalizing flow driven by a differential deformation of the Wiener process. As a result, we obtain a rich time series model whose observable process inherits many of the appealing proper…

    Submitted 13 July, 2021; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: Accepted to NeurIPS 2020

  22. arXiv:2002.10501  [pdf, other]

    cs.LG stat.ML

    Variational Hyper RNN for Sequence Modeling

    Authors: Ruizhi Deng, Yanshuai Cao, Bo Chang, Leonid Sigal, Greg Mori, Marcus A. Brubaker

    Abstract: In this work, we propose a novel probabilistic sequence model that excels at capturing high variability in time series data, both across sequences and within an individual sequence. Our method uses temporal latent variables to capture information about the underlying data pattern and dynamically decodes the latent information into modifications of weights of the base decoder and recurrent model. T…

    Submitted 24 February, 2020; originally announced February 2020.

  23. arXiv:2001.06538  [pdf, other]

    cs.CV

    Adapting Grad-CAM for Embedding Networks

    Authors: Lei Chen, Jianhui Chen, Hossein Hajimirsadeghi, Greg Mori

    Abstract: The gradient-weighted class activation mapping (Grad-CAM) method can faithfully highlight important regions in images for deep model prediction in image classification, image captioning and many other tasks. It uses the gradients in back-propagation as weights (grad-weights) to explain network decisions. However, applying Grad-CAM to embedding networks raises significant challenges because embeddi…

    Submitted 17 January, 2020; originally announced January 2020.

    Comments: WACV 2020 camera ready

  24. arXiv:1912.02401  [pdf, other]

    cs.CV cs.LG eess.IV

    Generating Videos of Zero-Shot Compositions of Actions and Objects

    Authors: Megha Nawhal, Mengyao Zhai, Andreas Lehrmann, Leonid Sigal, Greg Mori

    Abstract: Human activity videos involve rich, varied interactions between people and objects. In this paper we develop methods for generating such videos -- making progress toward addressing the important, open problem of video generation in complex scenes. In particular, we introduce the task of generating human-object interaction videos in a zero-shot compositional setting, i.e., generating videos for act…

    Submitted 17 July, 2020; v1 submitted 5 December, 2019; originally announced December 2019.

    Comments: Accepted at ECCV'20; Project Page: https://www.sfu.ca/~mnawhal/projects/zs_hoi_generation.html

  25. arXiv:1910.08281  [pdf, other]

    cs.LG stat.ML

    Point Process Flows

    Authors: Nazanin Mehrasa, Ruizhi Deng, Mohamed Osama Ahmed, Bo Chang, Jiawei He, Thibaut Durand, Marcus Brubaker, Greg Mori

    Abstract: Event sequences can be modeled by temporal point processes (TPPs) to capture their asynchronous and probabilistic nature. We propose an intensity-free framework that directly models the point process distribution by utilizing normalizing flows. This approach is capable of capturing highly complex temporal distributions and does not rely on restrictive parametric forms. Comparisons with state-of-th…

    Submitted 22 December, 2019; v1 submitted 18 October, 2019; originally announced October 2019.

  26. arXiv:1910.01743  [pdf, other]

    cs.LG stat.ML

    Graph Generation with Variational Recurrent Neural Network

    Authors: Shih-Yang Su, Hossein Hajimirsadeghi, Greg Mori

    Abstract: Generating graph structures is a challenging problem due to the diverse representations and complex dependencies among nodes. In this paper, we introduce Graph Variational Recurrent Neural Network (GraphVRNN), a probabilistic autoregressive model for graph generation. Through modeling the latent variables of graph data, GraphVRNN can capture the joint distributions of graph structures and the unde…

    Submitted 1 October, 2019; originally announced October 2019.

  27. arXiv:1909.13196  [pdf, other]

    cs.LG cs.CV stat.ML

    Policy Message Passing: A New Algorithm for Probabilistic Graph Inference

    Authors: Zhiwei Deng, Greg Mori

    Abstract: A general graph-structured neural network architecture operates on graphs through two core components: (1) complex enough message functions; (2) a fixed information aggregation process. In this paper, we present the Policy Message Passing algorithm, which takes a probabilistic perspective and reformulates the whole information aggregation as stochastic sequential processes. The algorithm works on…

    Submitted 28 September, 2019; originally announced September 2019.

  28. arXiv:1909.13165  [pdf, other]

    cs.RO cs.AI cs.LG

    Relational Graph Learning for Crowd Navigation

    Authors: Changan Chen, Sha Hu, Payam Nikdel, Greg Mori, Manolis Savva

    Abstract: We present a relational graph learning approach for robotic crowd navigation using model-based deep reinforcement learning that plans actions by looking into the future. Our approach reasons about the relations between all agents based on their latent features and uses a Graph Convolutional Network to encode higher-order interactions in each agent's state representation, which is subsequently leve…

    Submitted 3 August, 2020; v1 submitted 28 September, 2019; originally announced September 2019.

    Comments: Accepted to IROS 2020. Added links to codes and video demo

  29. arXiv:1909.12000  [pdf, other]

    cs.CV

    CoPhy: Counterfactual Learning of Physical Dynamics

    Authors: Fabien Baradel, Natalia Neverova, Julien Mille, Greg Mori, Christian Wolf

    Abstract: Understanding causes and effects in mechanical systems is an essential component of reasoning in the physical world. This work poses a new problem of counterfactual learning of object mechanics from visual input. We develop the CoPhy benchmark to assess the capacity of the state-of-the-art models for causal physical reasoning in a synthetic 3D environment and propose a model for learning the physi…

    Submitted 7 April, 2020; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: ICLR 2020, Spotlight presentation

  30. arXiv:1908.02436  [pdf, other]

    cs.LG cs.CV stat.ML

    Continuous Graph Flow

    Authors: Zhiwei Deng, Megha Nawhal, Lili Meng, Greg Mori

    Abstract: In this paper, we propose Continuous Graph Flow, a generative continuous flow based method that aims to model complex distributions of graph-structured data. Once learned, the model can be applied to an arbitrary graph, defining a probability density over the random variables represented by the graph. It is formulated as an ordinary differential equation system with shared and reusable functions t…

    Submitted 28 September, 2019; v1 submitted 7 August, 2019; originally announced August 2019.

  31. arXiv:1907.10719  [pdf, other]

    cs.CV

    LayoutVAE: Stochastic Scene Layout Generation From a Label Set

    Authors: Akash Abdu Jyothi, Thibaut Durand, Jiawei He, Leonid Sigal, Greg Mori

    Abstract: Recently there is an increasing interest in scene generation within the research community. However, models used for generating scene layouts from textual description largely ignore plausible visual variations within the structure dictated by the text. We propose LayoutVAE, a variational autoencoder based framework for generating stochastic scene layouts. LayoutVAE is a versatile modeling framewor…

    Submitted 1 June, 2021; v1 submitted 24 July, 2019; originally announced July 2019.

    Comments: 20 pages, 24 figures, accepted in ICCV 2019

  32. arXiv:1907.10107  [pdf, other]

    cs.CV

    Lifelong GAN: Continual Learning for Conditional Image Generation

    Authors: Mengyao Zhai, Lei Chen, Fred Tung, Jiawei He, Megha Nawhal, Greg Mori

    Abstract: Lifelong learning is challenging for deep neural networks due to their susceptibility to catastrophic forgetting. Catastrophic forgetting occurs when a trained network is not able to maintain its ability to accomplish previously learned tasks when it is trained to perform new tasks. We study the problem of lifelong learning for generative models, extending a trained network to new conditional gene…

    Submitted 22 August, 2019; v1 submitted 23 July, 2019; originally announced July 2019.

    Comments: accepted to ICCV 2019

  33. arXiv:1907.09682  [pdf, other]

    cs.CV

    Similarity-Preserving Knowledge Distillation

    Authors: Frederick Tung, Greg Mori

    Abstract: Knowledge distillation is a widely applicable technique for training a student neural network under the guidance of a trained teacher network. For example, in neural network compression, a high-capacity teacher is distilled to train a compact student; in privileged learning, a teacher trained with privileged data is distilled to train a student without access to that data. The distillation loss de…

    Submitted 1 August, 2019; v1 submitted 22 July, 2019; originally announced July 2019.

    Comments: ICCV 2019 camera ready

  34. arXiv:1904.03273  [pdf, other]

    cs.CV cs.LG

    A Variational Auto-Encoder Model for Stochastic Point Processes

    Authors: Nazanin Mehrasa, Akash Abdu Jyothi, Thibaut Durand, Jiawei He, Leonid Sigal, Greg Mori

    Abstract: We propose a novel probabilistic generative model for action sequences. The model is termed the Action Point Process VAE (APP-VAE), a variational auto-encoder that can capture the distribution over the times and categories of action sequences. Modeling the variety of possible action sequences is a challenge, which we show can be addressed via the APP-VAE's use of latent representations and non-lin…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: CVPR 19

  35. arXiv:1902.09720  [pdf, other]

    cs.CV

    Learning a Deep ConvNet for Multi-label Classification with Partial Labels

    Authors: Thibaut Durand, Nazanin Mehrasa, Greg Mori

    Abstract: Deep ConvNets have shown great performance for single-label image classification (e.g. ImageNet), but it is necessary to move beyond the single-label classification task because pictures of everyday life are inherently multi-label. Multi-label classification is a more difficult task than single-label classification because both the input images and output label spaces are more complex. Furthermore…

    Submitted 25 February, 2019; originally announced February 2019.

    Comments: CVPR 2019

  36. arXiv:1808.04063  [pdf, other]

    cs.CV

    Time Perception Machine: Temporal Point Processes for the When, Where and What of Activity Prediction

    Authors: Yatao Zhong, Bicheng Xu, Guang-Tong Zhou, Luke Bornn, Greg Mori

    Abstract: Numerous powerful point process models have been developed to understand temporal patterns in sequential data from fields such as health-care, electronic commerce, social networks, and natural disaster forecasting. In this paper, we develop novel models for learning the temporal distribution of human activities in streaming data (e.g., videos and person trajectories). We propose an integrated fram…

    Submitted 14 August, 2018; v1 submitted 13 August, 2018; originally announced August 2018.

  37. arXiv:1806.06157  [pdf, other]

    cs.CV

    Object Level Visual Reasoning in Videos

    Authors: Fabien Baradel, Natalia Neverova, Christian Wolf, Julien Mille, Greg Mori

    Abstract: Human activity recognition is typically addressed by detecting key concepts like global and local motion, features related to object classes present in the scene, as well as features related to the global context. The next open challenges in activity recognition require a level of understanding that pushes beyond this and call for models with capabilities for fine distinction and detailed comprehe…

    Submitted 20 September, 2018; v1 submitted 15 June, 2018; originally announced June 2018.

    Comments: Accepted at ECCV 2018 - long version (16 pages + ref)

    Journal ref: ECCV 2018

  38. arXiv:1805.08916  [pdf, other]

    stat.ML cs.LG

    Distribution Aware Active Learning

    Authors: Arash Mehrjou, Mehran Khodabandeh, Greg Mori

    Abstract: Discriminative learning machines often need a large set of labeled samples for training. Active learning (AL) settings assume that the learner has the freedom to ask an oracle to label its desired samples. Traditional AL algorithms heuristically choose query samples about which the current learner is uncertain. This strategy does not make good use of the structure of the dataset at hand and is pro…

    Submitted 22 May, 2018; originally announced May 2018.

  39. arXiv:1803.08085  [pdf, other]

    cs.CV

    Probabilistic Video Generation using Holistic Attribute Control

    Authors: Jiawei He, Andreas Lehrmann, Joseph Marino, Greg Mori, Leonid Sigal

    Abstract: Videos express highly structured spatio-temporal patterns of visual data. A video can be thought of as being governed by two factors: (i) temporally invariant (e.g., person identity), or slowly varying (e.g., activity), attribute-induced appearance, encoding the persistent content of each frame, and (ii) an inter-frame motion or scene dynamics (e.g., encoding evolution of the person executing the…

    Submitted 21 March, 2018; originally announced March 2018.

  40. arXiv:1802.06459  [pdf, other]

    cs.CV

    Structured Label Inference for Visual Understanding

    Authors: Nelson Nauata, Hexiang Hu, Guang-Tong Zhou, Zhiwei Deng, Zicheng Liao, Greg Mori

    Abstract: Visual data such as images and videos contain a rich source of structured semantic labels as well as a wide range of interacting components. Visual content could be assigned with fine-grained labels describing major components, coarse-grained labels depicting high level abstractions, or a set of labels revealing attributes. Such categorization over different, interacting layers of labels evinces t…

    Submitted 18 February, 2018; originally announced February 2018.

  41. arXiv:1801.05895  [pdf, other]

    cs.CV

    Sparsely Aggregated Convolutional Networks

    Authors: Ligeng Zhu, Ruizhi Deng, Michael Maire, Zhiwei Deng, Greg Mori, Ping Tan

    Abstract: We explore a key architectural aspect of deep convolutional neural networks: the pattern of internal skip connections used to aggregate outputs of earlier layers for consumption by deeper layers. Such aggregation is critical to facilitate training of very deep networks in an end-to-end manner. This is a primary reason for the widespread adoption of residual networks, which aggregate outputs via cu…

    Submitted 7 February, 2019; v1 submitted 17 January, 2018; originally announced January 2018.

    Comments: Accepted to ECCV 2018

  42. arXiv:1712.01955  [pdf, other]

    cs.CV

    Learning to Forecast Videos of Human Activity with Multi-granularity Models and Adaptive Rendering

    Authors: Mengyao Zhai, Jiacheng Chen, Ruizhi Deng, Lei Chen, Ligeng Zhu, Greg Mori

    Abstract: We propose an approach for forecasting video of complex human activity involving multiple people. Direct pixel-level prediction is too simple to handle the appearance variability in complex activities. Hence, we develop novel intermediate representations. An architecture combining a hierarchical temporal model for predicting human poses and encoder-decoder convolutional neural networks for renderi…

    Submitted 5 December, 2017; originally announced December 2017.

  43. arXiv:1707.09102  [pdf, other]

    cs.CV

    Fine-Pruning: Joint Fine-Tuning and Compression of a Convolutional Network with Bayesian Optimization

    Authors: Frederick Tung, Srikanth Muralidharan, Greg Mori

    Abstract: When approaching a novel visual recognition problem in a specialized image domain, a common strategy is to start with a pre-trained deep neural network and fine-tune it to the specialized domain. If the target domain covers a smaller visual space than the source domain used for pre-training (e.g. ImageNet), the fine-tuned network is likely to be over-parameterized. However, applying network prunin…

    Submitted 28 July, 2017; originally announced July 2017.

    Comments: BMVC 2017 oral

  44. arXiv:1706.05028  [pdf, ps, other]

    cs.CV

    Hierarchical Label Inference for Video Classification

    Authors: Nelson Nauata, Jonathan Smith, Greg Mori

    Abstract: Videos are a rich source of high-dimensional structured data, with a wide range of interacting components at varying levels of granularity. In order to improve understanding of unconstrained internet videos, it is important to consider the role of labels at separate levels of abstraction. In this paper, we consider the use of the Bidirectional Inference Neural Network (BINN) for performing graph-b…

    Submitted 21 January, 2018; v1 submitted 15 June, 2017; originally announced June 2017.

  45. arXiv:1706.02884  [pdf, other]

    cs.CV

    Learning to Learn from Noisy Web Videos

    Authors: Serena Yeung, Vignesh Ramanathan, Olga Russakovsky, Liyue Shen, Greg Mori, Li Fei-Fei

    Abstract: Understanding the simultaneously very diverse and intricately fine-grained set of possible human actions is a critical open problem in computer vision. Manually labeling training videos is feasible for some action classes but doesn't scale to the full long-tailed distribution of actions. A promising way to address this is to leverage noisy data from web queries to learn new actions, using semi-sup…

    Submitted 9 June, 2017; originally announced June 2017.

    Comments: To appear in CVPR 2017

  46. arXiv:1706.02342  [pdf, other]

    cs.CV

    Active Learning for Structured Prediction from Partially Labeled Data

    Authors: Mehran Khodabandeh, Zhiwei Deng, Mostafa S. Ibrahim, Shinichi Satoh, Greg Mori

    Abstract: We propose a general purpose active learning algorithm for structured prediction, gathering labeled data for training a model that outputs a set of related labels for an image or video. Active learning starts with a limited initial training set, then iterates querying a user for labels on unlabeled data and retraining the model. We propose a novel algorithm for selecting data for labeling, choosin…

    Submitted 9 June, 2017; v1 submitted 7 June, 2017; originally announced June 2017.

  47. arXiv:1706.00893  [pdf, other]

    cs.CV

    Learning Person Trajectory Representations for Team Activity Analysis

    Authors: Nazanin Mehrasa, Yatao Zhong, Frederick Tung, Luke Bornn, Greg Mori

    Abstract: Activity analysis in which multiple people interact across a large space is challenging due to the interplay of individual actions and collective group dynamics. We propose an end-to-end approach for learning person trajectory representations for group activity analysis. The learned representations encode rich spatio-temporal dependencies and capture useful motion patterns for recognizing individu…

    Submitted 2 June, 2017; originally announced June 2017.

  48. arXiv:1705.10861  [pdf, other]

    cs.CV

    Generic Tubelet Proposals for Action Localization

    Authors: Jiawei He, Mostafa S. Ibrahim, Zhiwei Deng, Greg Mori

    Abstract: We develop a novel framework for action localization in videos. We propose the Tube Proposal Network (TPN), which can generate generic, class-independent, video-level tubelet proposals in videos. The generated tubelet proposals can be utilized in various video analysis tasks, including recognizing and localizing actions in videos. In particular, we integrate these generic tubelet proposals into a…

    Submitted 30 May, 2017; originally announced May 2017.

  49. arXiv:1703.09891  [pdf, other]

    cs.CV cs.AI cs.LG

    LabelBank: Revisiting Global Perspectives for Semantic Segmentation

    Authors: Hexiang Hu, Zhiwei Deng, Guang-Tong Zhou, Fei Sha, Greg Mori

    Abstract: Semantic segmentation requires a detailed labeling of image pixels by object category. Information derived from local image patches is necessary to describe the detailed shape of individual objects. However, this information is ambiguous and can result in noisy labels. Global inference of image content can instead capture the general semantic concepts present. We advocate that holistic inference o…

    Submitted 29 March, 2017; originally announced March 2017.

    Comments: Pre-prints

  50. arXiv:1611.08061  [pdf, other]

    cs.CV

    Recalling Holistic Information for Semantic Segmentation

    Authors: Hexiang Hu, Zhiwei Deng, Guang-tong Zhou, Fei Sha, Greg Mori

    Abstract: Semantic segmentation requires a detailed labeling of image pixels by object category. Information derived from local image patches is necessary to describe the detailed shape of individual objects. However, this information is ambiguous and can result in noisy labels. Global inference of image content can instead capture the general semantic concepts present. We advocate that high-recall holistic…

    Submitted 23 November, 2016; originally announced November 2016.