Volume 17, Issue 4, November 2021
introduction
Free
Table of Contents: Online Supplement Volume 17, Number 2s-3s
Article No.: 117e, Pages 1–5 https://doi.org/10.1145/3507468
research-article
Dual-Stream Guided-Learning via a Priori Optimization for Person Re-identification
Article No.: 117, Pages 1–22 https://doi.org/10.1145/3447715

The task of person re-identification (re-ID) is to find the same pedestrian across non-overlapping camera views. Generally, the performance of person re-ID can be affected by background clutter. However, existing segmentation algorithms cannot obtain ...

research-article
Adaptive Compression for Online Computer Vision: An Edge Reinforcement Learning Approach
Article No.: 118, Pages 1–23 https://doi.org/10.1145/3447878

With the growth of computer vision-based applications, an explosive amount of images have been uploaded to cloud servers that host such online computer vision algorithms, usually in the form of deep learning models. JPEG has been used as the de facto ...

research-article
Smart Director: An Event-Driven Directing System for Live Broadcasting
Article No.: 119, Pages 1–18 https://doi.org/10.1145/3448981

Live video broadcasting normally requires a multitude of skills and expertise with domain knowledge to enable multi-camera productions. As the number of cameras keeps increasing, directing a live sports broadcast has now become more complicated and ...

research-article
Dual-Stream Structured Graph Convolution Network for Skeleton-Based Action Recognition
Article No.: 120, Pages 1–22 https://doi.org/10.1145/3450410

In this work, we propose a dual-stream structured graph convolution network (DS-SGCN) to solve the skeleton-based action recognition problem. The spatio-temporal coordinates and appearance contexts of the skeletal joints are jointly integrated into the ...

research-article
Unsupervised Domain Expansion for Visual Categorization
Article No.: 121, Pages 1–24 https://doi.org/10.1145/3448108

Expanding visual categorization into a novel domain without the need of extra annotation has been a long-term interest for multimedia intelligence. Previously, this challenge has been approached by unsupervised domain adaptation (UDA). Given labeled data ...

research-article
Task-independent Recognition of Communication Skills in Group Interaction Using Time-series Modeling
Article No.: 122, Pages 1–27 https://doi.org/10.1145/3450283

Case studies of group discussions are considered an effective way to assess communication skills (CS). This method can help researchers evaluate participants’ engagement with each other in a specific realistic context. In this article, multimodal analysis ...

research-article
Where Are They Going? Predicting Human Behaviors in Crowded Scenes
Article No.: 123, Pages 1–19 https://doi.org/10.1145/3449359

In this article, we propose a framework for crowd behavior prediction in complicated scenarios. The fundamental framework is designed using the standard encoder-decoder scheme, which is built upon the long short-term memory module to capture the temporal ...

research-article
Using Multisensory Content to Impact the Quality of Experience of Reading Digital Books
Article No.: 124, Pages 1–18 https://doi.org/10.1145/3458676

Multisensorial books enrich a story with either traditional multimedia content or sensorial effects. The main idea is to increase children’s interest in reading by enhancing their QoE while reading. Studies on enriched and/or augmented e-books also ...

research-article
Bi-Directional Co-Attention Network for Image Captioning
Article No.: 125, Pages 1–20 https://doi.org/10.1145/3460474

Image captioning, which automatically describes an image with natural language, is regarded as a fundamental challenge in computer vision. In recent years, significant advances have been made in image captioning by improving attention mechanisms. ...

research-article
Cross-Domain Object Representation via Robust Low-Rank Correlation Analysis
Article No.: 126, Pages 1–20 https://doi.org/10.1145/3458825

Cross-domain data has become very popular recently since various viewpoints and different sensors tend to facilitate better data representation. In this article, we propose a novel cross-domain object representation algorithm (RLRCA) which not only ...

research-article
Cross-Modal Hybrid Feature Fusion for Image-Sentence Matching
Article No.: 127, Pages 1–23 https://doi.org/10.1145/3458281

Image-sentence matching is a challenging task in the field of language and vision, which aims at measuring the similarities between images and sentence descriptions. Most existing methods independently map the global features of images and sentences into ...

research-article
Fine-Grained Visual Textual Alignment for Cross-Modal Retrieval Using Transformer Encoders
Article No.: 128, Pages 1–23 https://doi.org/10.1145/3451390

Despite the evolution of deep-learning-based visual-textual processing systems, precise multi-modal matching remains a challenging task. In this work, we tackle the task of cross-modal retrieval through image-sentence matching based on word-region ...

research-article
Open Access
Health Status Prediction with Local-Global Heterogeneous Behavior Graph
Article No.: 129, Pages 1–21 https://doi.org/10.1145/3457893

Health management is getting increasing attention all over the world. However, existing health management mainly relies on hospital examination and treatment, which are complicated and untimely. The emergence of mobile devices provides the possibility to ...

research-article
Perceptual Quality Assessment of Low-light Image Enhancement
Article No.: 130, Pages 1–24 https://doi.org/10.1145/3457905

Low-light image enhancement algorithms (LIEA) can light up images captured in dark or back-lighting conditions. However, LIEA may introduce various distortions such as structure damage, color shift, and noise into the enhanced images. Despite various ...

research-article
Dissimilarity-Based Regularized Learning of Charts
Article No.: 131, Pages 1–23 https://doi.org/10.1145/3458884

Chart images exhibit significant variabilities that make each image different from others even though they belong to the same class or categories. Classification of charts is a major challenge because each chart class has variations in features, structure,...

research-article
A New Foreground-Background based Method for Behavior-Oriented Social Media Image Classification
Article No.: 132, Pages 1–25 https://doi.org/10.1145/3458051

Due to various applications, research on personal traits using information on social media has become an important area. In this paper, a new method for the classification of behavior-oriented social images uploaded on various social media platforms is ...

research-article
Open Access
An Adaptive Bitrate Switching Algorithm for Speech Applications in Context of WebRTC
Article No.: 133, Pages 1–21 https://doi.org/10.1145/3458751

Web Real-Time Communication (WebRTC) combines a set of standards and technologies to enable high-quality audio, video, and auxiliary data exchange in web browsers and mobile applications. It enables peer-to-peer multimedia sessions over IP networks ...

research-article
A Fast View Synthesis Implementation Method for Light Field Applications
Article No.: 134, Pages 1–20 https://doi.org/10.1145/3459098

View synthesis (VS) for light field images is a very time-consuming task due to the large number of pixels involved and the intensive computations required, which may prevent its use in practical real-time three-dimensional systems. In this article, we propose an ...

research-article
Bayesian Covariance Representation with Global Informative Prior for 3D Action Recognition
Article No.: 135, Pages 1–22 https://doi.org/10.1145/3460235

Owing to the merits of high-order statistics and Riemannian geometry, the covariance matrix has become a generic feature representation for action recognition. An independent action can be represented by empirical statistics over all of its pose samples. Two ...

research-article
Pedestrian-Aware Panoramic Video Stitching Based on a Structured Camera Array
Article No.: 136, Pages 1–24 https://doi.org/10.1145/3460511

The panorama stitching system is an indispensable module in surveillance or space exploration. Such a system enables the viewer to understand the surroundings instantly by aligning the surrounding images on a plane and fusing them naturally. The ...

research-article
Y-Net: Dual-branch Joint Network for Semantic Segmentation
Article No.: 137, Pages 1–22 https://doi.org/10.1145/3460940

Most existing segmentation networks are built upon a “U-shaped” encoder–decoder structure, where the multi-level features extracted by the encoder are gradually aggregated by the decoder. Although this structure has been proven to be effective in ...

research-article
Detecting Non-Aligned Double JPEG Compression Based on Amplitude-Angle Feature
Article No.: 138, Pages 1–18 https://doi.org/10.1145/3464388

Due to the popularity of JPEG-format images in recent years, JPEG images are inevitably subject to image editing operations. Thus, some tampered images will carry traces of non-aligned double JPEG (NA-DJPEG) compression. By detecting the presence of NA-DJPEG ...

research-article
Residual-guided In-loop Filter Using Convolution Neural Network
Article No.: 139, Pages 1–19 https://doi.org/10.1145/3460820

The block-based coding structure in the hybrid video coding framework inevitably introduces compression artifacts such as blocking, ringing, and so on. To compensate for those artifacts, extensive filtering techniques were proposed in the loop of video ...

research-article
Trust Mechanism of Feedback Trust Weight in Multimedia Network
Article No.: 140, Pages 1–26 https://doi.org/10.1145/3391296

It is necessary to solve the inaccurate data arising from data reliability ignored by most data fusion algorithms drawing upon collaborative filtering and fuzzy network theory. Therefore, a model is constructed based on the collaborative filtering ...
