
YouTube Movie Reviews: Sentiment Analysis in an Audio-Visual Context

Published: 01 May 2013

Abstract

This work focuses on automatically analyzing a speaker's sentiment in online videos containing movie reviews. In addition to textual information, the approach incorporates audio features of the kind typically used in speech-based emotion recognition, as well as video features that encode valence information conveyed by the speaker. Experimental results indicate that training on written movie reviews is a promising alternative to relying exclusively on (spoken) in-domain data when building a system that analyzes spoken movie-review videos, and that language-independent audio-visual analysis can compete with linguistic analysis.
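A minimal sketch, not the authors' system, can illustrate the cross-domain, multimodal setup the abstract describes: a text classifier trained on written reviews and an audio-visual classifier trained on spoken clips, combined by decision-level (late) fusion. All data, feature dimensions, and fusion weights below are invented for illustration.

```python
# Hypothetical sketch of the setup described above: a text model trained on
# WRITTEN reviews and an audio-visual model trained on SPOKEN clips, fused
# at decision level. Everything here is toy data for illustration only.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Text model: trained on written movie reviews (toy examples).
written_reviews = ["a moving, brilliant film", "dull plot and wooden acting",
                   "an absolute joy to watch", "a tedious, forgettable mess"]
written_labels = np.array([1, 0, 1, 0])          # 1 = positive, 0 = negative
vectorizer = TfidfVectorizer()
text_clf = LogisticRegression().fit(
    vectorizer.fit_transform(written_reviews), written_labels)

# Audio-visual model: trained on spoken in-domain clips (synthetic features
# standing in for acoustic descriptors and smile intensity).
rng = np.random.default_rng(0)
av_features = rng.normal(size=(4, 8))
av_labels = np.array([1, 0, 1, 0])
av_clf = LogisticRegression().fit(av_features, av_labels)

# Late fusion: average the two unimodal posteriors for a new review video.
transcript = ["an absolute joy to watch"]
clip_feats = rng.normal(size=(1, 8))
p_text = text_clf.predict_proba(vectorizer.transform(transcript))[:, 1]
p_av = av_clf.predict_proba(clip_feats)[:, 1]
fused = 0.5 * p_text + 0.5 * p_av                # equal weights; tunable
print("fused positive probability:", fused[0])
```

The appeal of this split, per the abstract, is that the text model can be trained on abundant written reviews while the audio-visual model stays language independent.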


Reviews

Jin Chen

Sentiment analysis has been an active research topic in recent years. Previous work has focused on generic problem settings in which only a textual description is available, making it essentially a natural language processing task. For movie review sentiment analysis, however, linguistic approaches suffer from the fact that sentiment words in a review may describe characteristics of the movie itself rather than the reviewer's opinion. In this work, the authors fuse audio and video features with linguistic features to classify the sentiment of a review. The acoustic features comprise low-level descriptors (LLDs) and their derivatives, and the video features focus mainly on the smile expression, scaled by intensity from 0 to 100. Experimental results across different classification-fusion settings show that the audio and video features add discriminating information, although it is not clear whether the proposed fusion approach depends on the specific choice of audio and video features. It would be interesting to see whether other facial expressions might also be useful. Online Computing Reviews Service
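The feature set the review describes (acoustic LLDs plus their derivatives, and a smile intensity scaled 0 to 100) lends itself to a simple feature-level fusion. The sketch below is hypothetical: the LLDs used (log-energy and zero-crossing rate), the per-clip statistics, and the normalization are stand-ins; the paper's actual descriptor set is richer.

```python
# Hypothetical feature-level fusion matching the review's description:
# acoustic low-level descriptors (LLDs) plus their first-order derivatives,
# summarized per clip, concatenated with a smile intensity scaled 0-100.
import numpy as np

def llds_with_deltas(signal, frame_len=400, hop=160):
    """Per-frame log-energy and zero-crossing rate, plus first-order deltas."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len, hop)]
    energy = np.array([np.log(np.sum(f ** 2) + 1e-10) for f in frames])
    zcr = np.array([np.mean(np.abs(np.diff(np.sign(f)))) / 2 for f in frames])
    lld = np.stack([energy, zcr])                 # shape (n_lld, n_frames)
    return np.vstack([lld, np.gradient(lld, axis=1)])

def clip_feature_vector(signal, smile_intensity):
    """Per-clip statistics ("functionals") of the LLDs, fused with the
    normalized smile intensity into a single feature vector."""
    lld = llds_with_deltas(signal)
    stats = np.concatenate([lld.mean(axis=1), lld.std(axis=1)])
    return np.concatenate([stats, [smile_intensity / 100.0]])

# Toy one-second clip at 16 kHz with a hypothetical smile intensity of 73.
rng = np.random.default_rng(1)
x = rng.normal(size=16000)
print(clip_feature_vector(x, smile_intensity=73).shape)   # (9,)
```

Concatenating per-clip functionals with the smile score is early fusion; the review notes the paper also evaluates fusion at the classifier level.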


Published In

IEEE Intelligent Systems  Volume 28, Issue 3
May 2013
92 pages

Publisher

IEEE Educational Activities Department

United States

Publication History

Published: 01 May 2013

Author Tags

  1. Context awareness
  2. Feature extraction
  3. Motion pictures
  4. Pragmatics
  5. Videos
  6. Visualization
  7. YouTube
  8. affective computing
  9. audio-visual pattern recognition
  10. intelligent systems
  11. linguistic analysis
  12. sentiment analysis

Qualifiers

  • Research-article

Article Metrics

  • Downloads (last 12 months): 0
  • Downloads (last 6 weeks): 0
Reflects downloads up to 04 Oct 2024.


Cited By

  • (2024) Structural link prediction model with multi-view text semantic feature extraction. Intelligent Decision Technologies 18(3), 2421-2437. DOI: 10.3233/IDT-240022. Online publication date: 1-Jan-2024
  • (2024) Travel Agency Task Dialogue Corpus: A Multimodal Dataset with Age-Diverse Speakers. ACM Transactions on Asian and Low-Resource Language Information Processing 23(9), 1-23. DOI: 10.1145/3675166. Online publication date: 26-Jun-2024
  • (2024) A Survey of Cutting-edge Multimodal Sentiment Analysis. ACM Computing Surveys 56(9), 1-38. DOI: 10.1145/3652149. Online publication date: 25-Apr-2024
  • (2024) Multimodal Reaction: Information Modulation for Cross-Modal Representation Learning. IEEE Transactions on Multimedia 26, 2178-2191. DOI: 10.1109/TMM.2023.3293335. Online publication date: 1-Jan-2024
  • (2024) A Transformer-Based Model With Self-Distillation for Multimodal Emotion Recognition in Conversations. IEEE Transactions on Multimedia 26, 776-788. DOI: 10.1109/TMM.2023.3271019. Online publication date: 1-Jan-2024
  • (2024) Video summarization via knowledge-aware multimodal deep networks. Knowledge-Based Systems 293(C). DOI: 10.1016/j.knosys.2024.111670. Online publication date: 7-Jun-2024
  • (2024) Multimodal Emotion Recognition with Deep Learning. Information Fusion 105(C). DOI: 10.1016/j.inffus.2023.102218. Online publication date: 1-May-2024
  • (2024) A survey on semantic processing techniques. Information Fusion 101(C). DOI: 10.1016/j.inffus.2023.101988. Online publication date: 1-Jan-2024
  • (2023) Research on Animation Sentiment Analysis based on Deep Neural Network. Proceedings of the 2023 4th International Conference on Machine Learning and Computer Application, 125-129. DOI: 10.1145/3650215.3650239. Online publication date: 27-Oct-2023
  • (2023) Chinese Event Extraction Algorithm of Multi-Information Semantic Enhancements. Proceedings of the 2023 12th International Conference on Computing and Pattern Recognition, 58-64. DOI: 10.1145/3633637.3633646. Online publication date: 27-Oct-2023
