TOMM: Vol 18, No 2

Volume 18, Issue 2, May 2022

Editor:

Alberto Del Bimbo
University of Firenze, Italy

Publisher:

Association for Computing Machinery
New York, NY, United States

ISSN: 1551-6857

EISSN: 1551-6865

research-article

Efficient Light Field Image Compression with Enhanced Random Access

Article No.: 44, Pages 1–18, https://doi.org/10.1145/3471905

In light field image compression, facilitating random access to individual views plays a significant role in decoding views quickly, reducing memory footprint, and decreasing the bandwidth requirement for transmission. Highly efficient light field image ...

research-article

Evaluation of an Intervention Program Based on Mobile Apps to Learn Sexism Prevention in Teenagers

Article No.: 45, Pages 1–20, https://doi.org/10.1145/3471139

The fight against sexism is nowadays one of the flagship social movements in western countries. Adolescence is a crucial period, and some empirical studies have focused on the socialization of teenagers, proving that the socialization with the surrounding ...

research-article

Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition

Article No.: 46, Pages 1–24, https://doi.org/10.1145/3472722

Rapid progress and superior performance have been achieved for skeleton-based action recognition recently. In this article, we investigate this problem under a cross-dataset setting, which is a new, pragmatic, and challenging task in real-world scenarios. ...

research-article

Open Access

An Effective Forest Fire Detection Framework Using Heterogeneous Wireless Multimedia Sensor Networks

Article No.: 47, Pages 1–21, https://doi.org/10.1145/3473037

With improvements in the area of Internet of Things (IoT), surveillance systems have recently become more accessible. At the same time, optimizing the energy requirements of smart sensors, especially for data transmission, has always been very ...

research-article

Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training

Article No.: 48, Pages 1–16, https://doi.org/10.1145/3473140

Vision-language pre-training is an emerging and fast-developing research topic that transfers multi-modal knowledge from a resource-rich pre-training task to resource-limited downstream tasks. Unlike existing works that predominantly learn a single ...

research-article

Cascaded Structure-Learning Network with Using Adversarial Training for Robust Facial Landmark Detection

Article No.: 49, Pages 1–20, https://doi.org/10.1145/3474595

Recently, great progress has been achieved on facial landmark detection based on convolutional neural networks, but the task remains challenging due to partial occlusion and extreme head pose. In this paper, we propose a Cascaded Structure-Learning Network (...

research-article

Machine Learning Based Content-Agnostic Viewport Prediction for 360-Degree Video

Article No.: 50, Pages 1–24, https://doi.org/10.1145/3474833

Accurate and fast estimations or predictions of the (near) future location of the users of head-mounted devices within the virtual omnidirectional environment open a plethora of opportunities in application domains such as interactive immersive gaming and ...

research-article

Generating Virtual Wire Sculptural Art from 3D Models

Article No.: 51, Pages 1–23, https://doi.org/10.1145/3475798

Wire sculptures are objects sculpted from wire. In this article, we propose practical methods to create 3D virtual wire sculptural art from a given 3D model. In contrast, most of the previous 3D wire art results are reconstructed from input 2D ...

research-article

Response Generation by Jointly Modeling Personalized Linguistic Styles and Emotions

Article No.: 52, Pages 1–20, https://doi.org/10.1145/3475872

Natural language generation (NLG) has been an essential technique for various applications, like XiaoIce and Siri, and has attracted increasing attention recently. To improve the user experience, several emotion-aware NLG methods have been developed to generate ...

research-article

An l_½ and Graph Regularized Subspace Clustering Method for Robust Image Segmentation

Article No.: 53, Pages 1–24, https://doi.org/10.1145/3476514

Segmenting meaningful visual structures from an image is a fundamental and widely addressed problem in image analysis. However, among factors such as diverse visual patterns, noise, complex backgrounds, and similar textures present in foreground ...

research-article

Will You Ever Become Popular? Learning to Predict Virality of Dance Clips

Article No.: 54, Pages 1–24, https://doi.org/10.1145/3477533

Dance challenges are going viral in video communities like TikTok nowadays. Once a challenge becomes popular, thousands of short-form videos will be uploaded within a couple of days. Therefore, virality prediction from dance challenges is of great ...

research-article

Deep Semantic and Attentive Network for Unsupervised Video Summarization

Article No.: 55, Pages 1–21, https://doi.org/10.1145/3477538

With the rapid growth of video data, video summarization is a promising approach to shorten a lengthy video into a compact version. Although supervised summarization approaches have achieved state-of-the-art performance, they require frame-level annotated ...

research-article

Moment is Important: Language-Based Video Moment Retrieval via Adversarial Learning

Article No.: 56, Pages 1–21, https://doi.org/10.1145/3478025

The newly emerging language-based video moment retrieval task aims at retrieving a target video moment from an untrimmed video given a natural-language query. It is more applicable in reality since it is able to accurately localize a specific video ...

research-article

Learning Transferable Perturbations for Image Captioning

Article No.: 57, Pages 1–18, https://doi.org/10.1145/3478024

Recent studies have discovered that state-of-the-art deep learning models can be attacked by small but well-designed perturbations. Existing attack algorithms for the image captioning task are time-consuming, and their generated adversarial examples ...

research-article

SADnet: Semi-supervised Single Image Dehazing Method Based on an Attention Mechanism

Article No.: 58, Pages 1–23, https://doi.org/10.1145/3478457

Many real-life tasks such as military reconnaissance and traffic monitoring require high-quality images. However, images acquired in foggy or hazy weather pose obstacles to the implementation of these real-life tasks; consequently, image dehazing is an ...

research-article

Tell, Imagine, and Search: End-to-end Learning for Composing Text and Image to Image Retrieval

Article No.: 59, Pages 1–23, https://doi.org/10.1145/3478642

Composing Text and Image to Image Retrieval (CTI-IR) is an emerging task in computer vision, which allows retrieving images relevant to a query image with text describing desired modifications to the query image. Most conventional cross-modal retrieval ...

research-article

Structure-aware Meta-fusion for Image Super-resolution

Article No.: 60, Pages 1–25, https://doi.org/10.1145/3477553

There are two main categories of image super-resolution algorithms: distortion oriented and perception oriented. Recent evidence shows that reconstruction accuracy and perceptual quality are typically in disagreement with each other. In this article, we ...

research-article

Non-Acted Text and Keystrokes Database and Learning Methods to Recognize Emotions

Article No.: 61, Pages 1–24, https://doi.org/10.1145/3480968

Modern computing applications are adapting to the ready availability of huge and diverse data to make their pattern recognition methods smarter. Identification of the dominant emotion solely based on the text data generated by humans is ...

research-article

Transform, Warp, and Dress: A New Transformation-guided Model for Virtual Try-on

Article No.: 62, Pages 1–24, https://doi.org/10.1145/3491226

Virtual try-on has recently emerged in computer vision and multimedia communities with the development of architectures that can generate realistic images of a target person wearing a custom garment. This research interest is motivated by the large role ...

research-article

Adversarial Multi-Grained Embedding Network for Cross-Modal Text-Video Retrieval

Article No.: 63, Pages 1–23, https://doi.org/10.1145/3483381

Cross-modal retrieval between texts and videos has received consistent research interest in the multimedia community. Existing studies follow a trend of learning a joint embedding space to measure the distance between text and video representations. In ...

research-article

Fully Unsupervised Person Re-Identification via Selective Contrastive Learning

Article No.: 64, Pages 1–15, https://doi.org/10.1145/3485061

Person re-identification (ReID) aims to find the same person across images captured by various cameras. Existing fully supervised person ReID methods usually suffer from poor generalization capability caused by domain gaps. Unsupervised ...

research-article

Music2Dance: DanceNet for Music-Driven Dance Generation

Article No.: 65, Pages 1–21, https://doi.org/10.1145/3485664

Synthesizing human motions from music (i.e., music to dance) is appealing and has attracted much research interest in recent years. It is challenging because of the requirement for realistic and complex human motions for dance, but more importantly, the ...

survey

Understanding and Creating Art with AI: Review and Outlook

Article No.: 66, Pages 1–22, https://doi.org/10.1145/3475799

Technologies related to artificial intelligence (AI) have a strong impact on the changes of research and creative practices in visual arts. The growing number of research initiatives and creative applications that emerge in the intersection of AI and art ...

ACM Transactions on Multimedia Computing, Communications, and Applications
