Showing 1–40 of 40 results for author: Zadeh, A

Searching in archive cs.
  1. arXiv:2406.03599  [pdf, other]

    cs.CV cs.GR cs.LG

    Hi5: 2D Hand Pose Estimation with Zero Human Annotation

    Authors: Masum Hasan, Cengiz Ozel, Nina Long, Alexander Martin, Samuel Potter, Tariq Adnan, Sangwu Lee, Amir Zadeh, Ehsan Hoque

    Abstract: We propose a new large synthetic hand pose estimation dataset, Hi5, and a novel inexpensive method for collecting high-quality synthetic data that requires no human annotation or validation. Leveraging recent advancements in computer graphics, high-fidelity 3D hand models with diverse genders and skin colors, and dynamic environments and camera movements, our data synthesis pipeline allows precise…

    Submitted 5 June, 2024; originally announced June 2024.

  2. arXiv:2404.17187  [pdf, other]

    cs.LG

    An Explainable Deep Reinforcement Learning Model for Warfarin Maintenance Dosing Using Policy Distillation and Action Forging

    Authors: Sadjad Anzabi Zadeh, W. Nick Street, Barrett W. Thomas

    Abstract: Deep Reinforcement Learning is an effective tool for drug dosing for chronic condition management. However, the final protocol is generally a black box without any justification for its prescribed doses. This paper addresses this issue by proposing an explainable dosing protocol for warfarin using a Proximal Policy Optimization method combined with Policy Distillation. We introduce Action Forging…

    Submitted 26 April, 2024; originally announced April 2024.

  3. arXiv:2311.02390  [pdf, other]

    cs.NI cs.AI

    AI-based Self-healing Solutions Applied to Cellular Networks: An Overview

    Authors: Jaleh Farmani, Amirreza Khalil Zadeh

    Abstract: In this article, we provide an overview of machine learning (ML) methods, both classical and deep variants, that are used to implement self-healing for cell outages in cellular networks. Self-healing is a promising approach to network management, which aims to detect and compensate for cell outages in an autonomous way. This technology aims to decrease the expenses associated with the installation…

    Submitted 4 November, 2023; originally announced November 2023.

  4. arXiv:2303.03267  [pdf, other]

    cs.CL cs.SD eess.AS

    Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding

    Authors: Yingting Li, Ambuj Mehrish, Shuai Zhao, Rishabh Bhardwaj, Amir Zadeh, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: Fine-tuning is widely used as the default algorithm for transfer learning from pre-trained models. Parameter inefficiency can however arise when, during transfer learning, all the parameters of a large pre-trained model need to be updated for individual downstream tasks. As the number of parameters grows, fine-tuning is prone to overfitting and catastrophic forgetting. In addition, full fine-tunin…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: ICASSP 2023

  5. arXiv:2209.03430  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.MM

    Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

    Authors: Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency

    Abstract: Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design computer agents with intelligent capabilities such as understanding, reasoning, and learning through integrating multiple communicative modalities, including linguistic, acoustic, visual, tactile, and physiological messages. With the recent interest in video understanding, embodied autonomous agents, tex…

    Submitted 20 February, 2023; v1 submitted 7 September, 2022; originally announced September 2022.

  6. arXiv:2208.01036  [pdf, other]

    cs.LG cs.AI cs.CV

    Face-to-Face Contrastive Learning for Social Intelligence Question-Answering

    Authors: Alex Wilf, Martin Q. Ma, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency

    Abstract: Creating artificial social intelligence - algorithms that can understand the nuances of multi-person interactions - is an exciting and emerging challenge in processing facial expressions and gestures from multimodal videos. Recent multimodal methods have set the state of the art on many tasks, but have difficulty modeling the complex face-to-face conversational dynamics across speaking turns in so…

    Submitted 27 October, 2022; v1 submitted 29 July, 2022; originally announced August 2022.

  7. arXiv:2204.13666  [pdf, other]

    cs.LG cs.AR

    Schrödinger's FP: Dynamic Adaptation of Floating-Point Containers for Deep Learning Training

    Authors: Miloš Nikolić, Enrique Torres Sanchez, Jiahui Wang, Ali Hadi Zadeh, Mostafa Mahmoud, Ameer Abdelhadi, Kareem Ibrahim, Andreas Moshovos

    Abstract: The transfer of tensors from/to memory during neural network training dominates time and energy. To improve energy efficiency and performance, research has been exploring ways to use narrower data representations. So far, these attempts relied on user-directed trial-and-error to achieve convergence. We present methods that relieve users from this responsibility. Our methods dynamically adjust the…

    Submitted 16 May, 2024; v1 submitted 28 April, 2022; originally announced April 2022.

  8. Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models

    Authors: Ali Hadi Zadeh, Mostafa Mahmoud, Ameer Abdelhadi, Andreas Moshovos

    Abstract: Increasingly larger and better Transformer models keep advancing state-of-the-art accuracy and capability for Natural Language Processing applications. These models demand more computational power, storage, and energy. Mokey reduces the footprint of state-of-the-art 32-bit or 16-bit floating-point transformer models by quantizing all values to 4-bit indexes into dictionaries of representative 16-b…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Accepted at the 49th IEEE/ACM International Symposium on Computer Architecture (ISCA '22)

  9. Optimizing Warfarin Dosing using Deep Reinforcement Learning

    Authors: Sadjad Anzabi Zadeh, W. Nick Street, Barrett W. Thomas

    Abstract: Warfarin is a widely used anticoagulant, and has a narrow therapeutic range. Dosing of warfarin should be individualized, since slight overdosing or underdosing can have catastrophic or even fatal consequences. Despite much research on warfarin dosing, current dosing protocols do not live up to expectations, especially for patients sensitive to warfarin. We propose a deep reinforcement learning-ba…

    Submitted 23 December, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: 32 pages (including 3 appendices)

    Journal ref: Journal of Biomedical Informatics, 137 (2023) 104267

  10. arXiv:2110.13422  [pdf, other]

    cs.LG cs.AI stat.ML

    Relay Variational Inference: A Method for Accelerated Encoderless VI

    Authors: Amir Zadeh, Santiago Benoit, Louis-Philippe Morency

    Abstract: Variational Inference (VI) offers a method for approximating intractable likelihoods. In neural VI, inference of approximate posteriors is commonly done using an encoder. Alternatively, encoderless VI offers a framework for learning generative models from data without encountering suboptimalities caused by amortization via an encoder (e.g. in presence of missing or uncertain data). However, in abs…

    Submitted 13 January, 2023; v1 submitted 26 October, 2021; originally announced October 2021.

  11. arXiv:2108.01260  [pdf, other]

    cs.CL

    M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations

    Authors: Dushyant Singh Chauhan, Gopendra Vikram Singh, Navonil Majumder, Amir Zadeh, Asif Ekbal, Pushpak Bhattacharyya, Louis-Philippe Morency, Soujanya Poria

    Abstract: Humor recognition in conversations is a challenging task that has recently gained popularity due to its importance in dialogue understanding, including in multimodal settings (i.e., text, acoustics, and visual). The few existing datasets for humor are mostly in English. However, due to the tremendous growth in multilingual content, there is a great demand to build models and systems that support m…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: ICMI 2021

  12. arXiv:2107.13669  [pdf, other]

    cs.AI

    Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis

    Authors: Wei Han, Hui Chen, Alexander Gelbukh, Amir Zadeh, Louis-Philippe Morency, Soujanya Poria

    Abstract: Multimodal sentiment analysis aims to extract and integrate semantic information collected from multiple modalities to recognize the expressed emotions and sentiment in multimodal data. This research area's major concern lies in developing an extraordinary fusion scheme that can extract and integrate key information from various modalities. However, one issue that may restrict previous work to ach…

    Submitted 28 August, 2021; v1 submitted 28 July, 2021; originally announced July 2021.

    Comments: Accepted at ICMI 2021

  13. arXiv:2101.00574  [pdf, other]

    cs.LG cs.AI stat.ML

    StarNet: Gradient-free Training of Deep Generative Models using Determined System of Linear Equations

    Authors: Amir Zadeh, Santiago Benoit, Louis-Philippe Morency

    Abstract: In this paper we present an approach for training deep generative models solely based on solving determined systems of linear equations. A network that uses this approach, called a StarNet, has the following desirable properties: 1) training requires no gradient as solution to the system of linear equations is not stochastic, 2) is highly scalable when solving the system of linear equations w.r.t…

    Submitted 3 January, 2021; originally announced January 2021.

    Comments: Work in progress at CMU

  14. arXiv:2012.15358  [pdf]

    cs.AI cs.RO eess.SY

    A Review into Data Science and Its Approaches in Mechanical Engineering

    Authors: Ashkan Yousefi Zadeh, Meysam Shahbazy

    Abstract: Nowadays it is inevitable to use intelligent systems to improve the performance and optimization of different components of devices or factories. Furthermore, it is essential to have appropriate predictions to make better decisions in businesses, medical studies, engineering studies, etc. One of the newest and most widely used of these methods is a field called Data Science that all of the s…

    Submitted 30 December, 2020; originally announced December 2020.

    Comments: For associated information, see https://civilica.com/doc/1128400/

  15. arXiv:2010.11985  [pdf, other]

    cs.CL cs.CV cs.LG cs.MM

    MTAG: Modal-Temporal Attention Graph for Unaligned Human Multimodal Language Sequences

    Authors: Jianing Yang, Yongxin Wang, Ruitao Yi, Yuying Zhu, Azaan Rehman, Amir Zadeh, Soujanya Poria, Louis-Philippe Morency

    Abstract: Human communication is multimodal in nature; it is through multiple modalities such as language, voice, and facial expressions, that opinions and emotions are expressed. Data in this domain exhibits complex multi-relational and temporal interactions. Learning from this data is a fundamentally challenging research problem. In this paper, we propose Modal-Temporal Attention Graph (MTAG). MTAG is an…

    Submitted 28 April, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: NAACL 2021

  16. arXiv:2010.09522  [pdf, other]

    cs.CV cs.CL

    Multimodal Research in Vision and Language: A Review of Current and Emerging Trends

    Authors: Shagun Uppal, Sarthak Bhagat, Devamanyu Hazarika, Navonil Majumder, Soujanya Poria, Roger Zimmermann, Amir Zadeh

    Abstract: Deep Learning and its applications have cascaded impactful research and development with a diverse range of modalities present in the real-world data. More recently, this has enhanced research interests in the intersection of the Vision and Language arena with its numerous applications and fast-paced growth. In this paper, we present a detailed overview of the latest trends in research pertaining…

    Submitted 21 December, 2020; v1 submitted 19 October, 2020; originally announced October 2020.

  17. arXiv:2010.08065  [pdf, other]

    cs.AR cs.AI

    FPRaker: A Processing Element For Accelerating Neural Network Training

    Authors: Omar Mohamed Awad, Mostafa Mahmoud, Isak Edo, Ali Hadi Zadeh, Ciaran Bannon, Anand Jayarajan, Gennady Pekhimenko, Andreas Moshovos

    Abstract: We present FPRaker, a processing element for composing training accelerators. FPRaker processes several floating-point multiply-accumulation operations concurrently and accumulates their result into a higher precision accumulator. FPRaker boosts performance and energy efficiency during training by taking advantage of the values that naturally appear during training. Specifically, it processes the…

    Submitted 15 October, 2020; originally announced October 2020.

  18. TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training and Inference

    Authors: Mostafa Mahmoud, Isak Edo, Ali Hadi Zadeh, Omar Mohamed Awad, Gennady Pekhimenko, Jorge Albericio, Andreas Moshovos

    Abstract: TensorDash is a hardware-level technique for enabling data-parallel MAC units to take advantage of sparsity in their input operand streams. When used to compose a hardware accelerator for deep learning, TensorDash can speed up the training process while also increasing energy efficiency. TensorDash combines a low-cost, sparse input operand interconnect comprising an 8-input multiplexer per multipli…

    Submitted 1 September, 2020; originally announced September 2020.

  19. arXiv:2007.03626  [pdf, other]

    cs.CL cs.CV cs.LG stat.ML

    What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets

    Authors: Jianing Yang, Yuying Zhu, Yongxin Wang, Ruitao Yi, Amir Zadeh, Louis-Philippe Morency

    Abstract: Question answering biases in video QA datasets can mislead multimodal models into overfitting to QA artifacts and jeopardize the models' ability to generalize. Understanding how strong these QA biases are and where they come from helps the community measure progress more accurately and provides researchers with insights to debug their models. In this paper, we analyze QA biases in popular video question answer…

    Submitted 7 July, 2020; originally announced July 2020.

  20. arXiv:2005.06607  [pdf, other]

    cs.CL

    Improving Aspect-Level Sentiment Analysis with Aspect Extraction

    Authors: Navonil Majumder, Rishabh Bhardwaj, Soujanya Poria, Amir Zadeh, Alexander Gelbukh, Amir Hussain, Louis-Philippe Morency

    Abstract: Aspect-based sentiment analysis (ABSA), a popular research area in NLP, has two distinct parts -- aspect extraction (AE) and labeling the aspects with sentiment polarity (ALSA). Although distinct, these two tasks are highly correlated. The work primarily hypothesizes that transferring knowledge from a pre-trained AE model can benefit the performance of ALSA models. Based on this hypothesis, word emb…

    Submitted 3 May, 2020; originally announced May 2020.

  21. GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference

    Authors: Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos

    Abstract: Attention-based models have demonstrated remarkable success in various natural language understanding tasks. However, efficient execution remains a challenge for these models which are memory-bound due to their massive number of parameters. We present GOBO, a model quantization technique that compresses the vast majority (typically 99.9%) of the 32-bit floating-point parameters of state-of-the-art…

    Submitted 26 September, 2020; v1 submitted 7 May, 2020; originally announced May 2020.

    Comments: Accepted at the 53rd IEEE/ACM International Symposium on Microarchitecture - MICRO 2020

  22. arXiv:1912.09423  [pdf, ps, other]

    cs.LG cs.NE stat.ML

    Pseudo-Encoded Stochastic Variational Inference

    Authors: Amir Zadeh, Simon Hessner, Yao-Chong Lim, Louis-Philippe Morency

    Abstract: Posterior inference in directed graphical models is commonly done using a probabilistic encoder (a.k.a. inference model) conditioned on the input. Often this inference model is trained jointly with the probabilistic decoder (a.k.a. generator model). If the probabilistic encoder encounters complexities during training (e.g. suboptimal complexity or parameterization), then learning reaches a suboptimal obj…

    Submitted 19 December, 2019; originally announced December 2019.

  23. arXiv:1911.09826  [pdf, other]

    cs.LG cs.CL stat.ML

    Factorized Multimodal Transformer for Multimodal Sequential Learning

    Authors: Amir Zadeh, Chengfeng Mao, Kelly Shi, Yiwei Zhang, Paul Pu Liang, Soujanya Poria, Louis-Philippe Morency

    Abstract: The complex world around us is inherently multimodal and sequential (continuous). Information is scattered across different modalities and requires multiple continuous sensors to be captured. As machine learning leaps towards better generalization to the real world, multimodal sequential learning becomes a fundamental research area. Arguably, modeling arbitrarily distributed spatio-temporal dynamics w…

    Submitted 21 November, 2019; originally announced November 2019.

  24. arXiv:1911.09783  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    WildMix Dataset and Spectro-Temporal Transformer Model for Monoaural Audio Source Separation

    Authors: Amir Zadeh, Tianjun Ma, Soujanya Poria, Louis-Philippe Morency

    Abstract: Monoaural audio source separation is a challenging research area in machine learning. In this area, a mixture containing multiple audio sources is given, and a model is expected to disentangle the mixture into isolated atomic sources. In this paper, we first introduce a challenging new dataset for monoaural source separation called WildMix. WildMix is designed with the goal of extending the bounda…

    Submitted 21 November, 2019; originally announced November 2019.

  25. arXiv:1908.05787  [pdf, other]

    cs.LG cs.CL stat.ML

    Integrating Multimodal Information in Large Pretrained Transformers

    Authors: Wasifur Rahman, Md. Kamrul Hasan, Sangwu Lee, Amir Zadeh, Chengfeng Mao, Louis-Philippe Morency, Ehsan Hoque

    Abstract: Recent Transformer-based contextual word representations, including BERT and XLNet, have shown state-of-the-art performance in multiple disciplines within NLP. Fine-tuning the trained contextual models on task-specific datasets has been the key to achieving superior performance downstream. While fine-tuning these pre-trained models is straightforward for lexical applications (applications with onl…

    Submitted 21 November, 2020; v1 submitted 15 August, 2019; originally announced August 2019.

  26. arXiv:1904.06618  [pdf, other]

    cs.LG cs.CL stat.ML

    UR-FUNNY: A Multimodal Language Dataset for Understanding Humor

    Authors: Md Kamrul Hasan, Wasifur Rahman, Amir Zadeh, Jianyuan Zhong, Md Iftekhar Tanveer, Louis-Philippe Morency, Mohammed Hoque

    Abstract: Humor is a unique and creative communicative behavior displayed during social interactions. It is produced in a multimodal manner, through the usage of words (text), gestures (vision) and prosodic cues (acoustic). Understanding humor from these three modalities falls within boundaries of multimodal language; a recent research trend in natural language processing that models natural language as it…

    Submitted 13 April, 2019; originally announced April 2019.

    Journal ref: EMNLP-IJCNLP, 2019, 2046-2056

  27. arXiv:1903.00840  [pdf, other]

    cs.LG cs.AI stat.ML

    Variational Auto-Decoder: A Method for Neural Generative Modeling from Incomplete Data

    Authors: Amir Zadeh, Yao-Chong Lim, Paul Pu Liang, Louis-Philippe Morency

    Abstract: Learning a generative model from partial data (data with missingness) is a challenging area of machine learning research. We study a specific implementation of the Auto-Encoding Variational Bayes (AEVB) algorithm, named in this paper as a Variational Auto-Decoder (VAD). VAD is a generic framework which uses Variational Bayes and Markov Chain Monte Carlo (MCMC) methods to learn a generative model f…

    Submitted 3 January, 2021; v1 submitted 3 March, 2019; originally announced March 2019.

    Comments: Link to code and data available from https://github.com/A2Zadeh/Variational-Autodecoder

  28. arXiv:1811.09362  [pdf, other]

    cs.CL cs.AI

    Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors

    Authors: Yansen Wang, Ying Shen, Zhun Liu, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency

    Abstract: Humans convey their intentions through the usage of both verbal and nonverbal behaviors during face-to-face communication. Speaker intentions often vary dynamically depending on different nonverbal contexts, such as vocal patterns and facial expressions. As a result, when modeling human language, it is essential to not only consider the literal meaning of the words but also the nonverbal contexts…

    Submitted 25 November, 2018; v1 submitted 23 November, 2018; originally announced November 2018.

    Comments: Accepted by AAAI2019

  29. arXiv:1809.04931  [pdf, other]

    cs.HC cs.CV cs.LG

    Multimodal Local-Global Ranking Fusion for Emotion Recognition

    Authors: Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency

    Abstract: Emotion recognition is a core research area at the intersection of artificial intelligence and human communication analysis. It is a significant technical challenge since humans display their emotions through complex idiosyncratic combinations of the language, visual and acoustic modalities. In contrast to traditional multimodal fusion techniques, we approach emotion recognition from both direct p…

    Submitted 12 August, 2018; originally announced September 2018.

    Comments: ACM International Conference on Multimodal Interaction (ICMI 2018)

  30. arXiv:1808.03920  [pdf, other]

    cs.LG cs.AI cs.CL cs.NE stat.ML

    Multimodal Language Analysis with Recurrent Multistage Fusion

    Authors: Paul Pu Liang, Ziyin Liu, Amir Zadeh, Louis-Philippe Morency

    Abstract: Computational modeling of human multimodal language is an emerging research area in natural language processing spanning the language, visual and acoustic modalities. Comprehending multimodal language requires modeling not only the interactions within each modality (intra-modal interactions) but more importantly the interactions between modalities (cross-modal interactions). In this paper, we prop…

    Submitted 12 August, 2018; originally announced August 2018.

    Comments: EMNLP 2018

  31. arXiv:1806.06176  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    Learning Factorized Multimodal Representations

    Authors: Yao-Hung Hubert Tsai, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency, Ruslan Salakhutdinov

    Abstract: Learning multimodal representations is a fundamentally complex research problem due to the presence of multiple heterogeneous sources of information. Although the presence of multiple modalities provides additional valuable information, there are two key challenges to address when learning from multimodal data: 1) models must learn the complex intra-modal and cross-modal interactions for predictio…

    Submitted 14 May, 2019; v1 submitted 15 June, 2018; originally announced June 2018.

    Comments: ICLR 2019

  32. arXiv:1806.00064  [pdf, other]

    cs.AI cs.LG stat.ML

    Efficient Low-rank Multimodal Fusion with Modality-Specific Factors

    Authors: Zhun Liu, Ying Shen, Varun Bharadhwaj Lakshminarasimhan, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency

    Abstract: Multimodal research is an emerging field of artificial intelligence, and one of the main research problems in this field is multimodal fusion. The fusion of multimodal data is the process of integrating multiple unimodal representations into one compact multimodal representation. Previous research in this field has exploited the expressiveness of tensors for multimodal representation. However, the…

    Submitted 31 May, 2018; originally announced June 2018.

    Comments: * Equal contribution. 10 pages. Accepted by ACL 2018

  33. arXiv:1802.00927  [pdf, other]

    cs.LG cs.AI

    Memory Fusion Network for Multi-view Sequential Learning

    Authors: Amir Zadeh, Paul Pu Liang, Navonil Majumder, Soujanya Poria, Erik Cambria, Louis-Philippe Morency

    Abstract: Multi-view sequential learning is a fundamental problem in machine learning dealing with multi-view sequences. In a multi-view sequence, there exist two forms of interactions between different views: view-specific interactions and cross-view interactions. In this paper, we present a new neural architecture for multi-view sequential learning called the Memory Fusion Network (MFN) that explicitly a…

    Submitted 3 February, 2018; originally announced February 2018.

    Comments: AAAI 2018 Oral Presentation

  34. arXiv:1802.00924  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement Learning

    Authors: Minghai Chen, Sen Wang, Paul Pu Liang, Tadas Baltrušaitis, Amir Zadeh, Louis-Philippe Morency

    Abstract: With the increasing popularity of video sharing websites such as YouTube and Facebook, multimodal sentiment analysis has received increasing attention from the scientific community. Contrary to previous works in multimodal sentiment analysis which focus on holistic information in speech segments such as bag of words representations and average facial expression intensity, we develop a novel deep a…

    Submitted 3 February, 2018; originally announced February 2018.

    Comments: ICMI 2017 Oral Presentation, Honorable Mention Award

  35. arXiv:1802.00923  [pdf, other]

    cs.AI cs.CL cs.LG

    Multi-attention Recurrent Network for Human Communication Comprehension

    Authors: Amir Zadeh, Paul Pu Liang, Soujanya Poria, Prateek Vij, Erik Cambria, Louis-Philippe Morency

    Abstract: Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication, however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each mod…

    Submitted 3 February, 2018; originally announced February 2018.

    Comments: AAAI 2018 Oral Presentation

  36. arXiv:1707.07250  [pdf, other]

    cs.CL

    Tensor Fusion Network for Multimodal Sentiment Analysis

    Authors: Amir Zadeh, Minghai Chen, Soujanya Poria, Erik Cambria, Louis-Philippe Morency

    Abstract: Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language. In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. We introduce a novel model, termed Tensor Fusion Ne…

    Submitted 23 July, 2017; originally announced July 2017.

    Comments: Accepted as full paper in EMNLP 2017

  37. arXiv:1705.02735  [pdf, other]

    cs.CL cs.CY

    Combating Human Trafficking with Deep Multimodal Models

    Authors: Edmund Tong, Amir Zadeh, Cara Jones, Louis-Philippe Morency

    Abstract: Human trafficking is a global epidemic affecting millions of people across the planet. Sex trafficking, the dominant form of human trafficking, has seen a significant rise mostly due to the abundance of escort websites, where human traffickers can openly advertise among at-will escort advertisements. In this paper, we take a major step in the automatic detection of advertisements suspected to pert…

    Submitted 7 May, 2017; originally announced May 2017.

    Comments: ACL 2017 Long Paper

  38. arXiv:1611.08657  [pdf, other]

    cs.CV cs.AI

    Convolutional Experts Constrained Local Model for Facial Landmark Detection

    Authors: Amir Zadeh, Tadas Baltrušaitis, Louis-Philippe Morency

    Abstract: Constrained Local Models (CLMs) are a well-established family of methods for facial landmark detection. However, they have recently fallen out of favor to cascaded regression-based approaches. This is in part due to the inability of existing CLM local detectors to model the very complex individual landmark appearance that is affected by expression, illumination, facial hair, makeup, and accessorie…

    Submitted 26 July, 2017; v1 submitted 25 November, 2016; originally announced November 2016.

    Comments: Accepted at CVPR-W 2017

  39. arXiv:1606.06259  [pdf]

    cs.CL cs.MM

    MOSI: Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis in Online Opinion Videos

    Authors: Amir Zadeh, Rowan Zellers, Eli Pincus, Louis-Philippe Morency

    Abstract: People are sharing their opinions, stories and reviews through online video sharing websites every day. Studying sentiment and subjectivity in these opinion videos is receiving growing attention from academia and industry. While sentiment analysis has been successful for text, it is an understudied research question for videos and multimedia content. The biggest setbacks for studies in this d…

    Submitted 11 August, 2016; v1 submitted 20 June, 2016; originally announced June 2016.

    Comments: Accepted as Journal Publication in IEEE Intelligent Systems

    Journal ref: IEEE Intelligent Systems 31.6 (2016): 82-88

  40. arXiv:1511.02402  [pdf, ps, other]

    cs.LG

    Max-Sum Diversification, Monotone Submodular Functions and Semi-metric Spaces

    Authors: Sepehr Abbasi Zadeh, Mehrdad Ghadiri

    Abstract: In many applications, such as web-based search, document summarization, and facility location, the results should preferably be both representative and diversified subsets of documents. The goal of this study is to select a good "quality", bounded-size subset of a given set of items, while maintaining their diversity relative to a semi-metric distance function. This problem was f…

    Submitted 7 November, 2015; originally announced November 2015.

    Comments: This article draws heavily from arXiv:1203.6397 by other authors