TOMM: Vol 18, No 2s

SECTION: Special Section on Best Papers MM Asia 2021

research-article

Discriminative Visual Similarity Search with Semantically Cycle-consistent Hashing Networks

Article No.: 114, Pages 1–21, https://doi.org/10.1145/3532519

Deep hashing has great potential in large-scale visual similarity search due to its preferable efficiency in storage and computation. Technically, deep hashing for visual similarity search inherits the powerful representation capability of deep neural ...

research-article

Open Access

Deepfake Video Detection via Predictive Representation Learning

Article No.: 115, Pages 1–21, https://doi.org/10.1145/3536426

Increasingly advanced deepfake approaches have made the detection of deepfake videos very challenging. We observe that the general deepfake videos often exhibit appearance-level temporal inconsistencies in some facial components between frames, resulting ...

research-article

Open Access

LANBIQUE: LANguage-based Blind Image QUality Evaluation

Article No.: 116, Pages 1–19, https://doi.org/10.1145/3538649

Image quality assessment is often performed with deep networks that are fine-tuned to regress a human provided quality score of a given image. Usually, this approach may lack generalization capabilities and, while being highly precise on similar image ...

SECTION: Special Issue on Deep Learning Algorithms for Multimedia Data Analytics in Industry 4.0 Applications

research-article

Open Access

Smart City Construction and Management by Digital Twins and BIM Big Data in COVID-19 Scenario

Article No.: 117, Pages 1–21, https://doi.org/10.1145/3529395

With the rapid development of information technology and the spread of Corona Virus Disease 2019 (COVID-19), the government and urban managers are looking for ways to use technology to make the city smarter and safer. Intelligent transportation can play a ...

research-article

A Comprehensive Study of Deep Learning-based Covert Communication

Article No.: 118, Pages 1–19, https://doi.org/10.1145/3508365

Deep learning-based methods have been popular in multimedia analysis tasks, including classification, detection, segmentation, and so on. In addition to conventional applications, this model can be widely used for covert communication, i.e., information ...

research-article

Exploiting Attention-Consistency Loss For Spatial-Temporal Stream Action Recognition

Article No.: 119, Pages 1–15, https://doi.org/10.1145/3538749

Currently, many action recognition methods mostly consider the information from spatial streams. We propose a new perspective inspired by the human visual system to combine both spatial and temporal streams to measure their attention consistency. ...

research-article

Perturbation-enabled Deep Federated Learning for Preserving Internet of Things-based Social Networks

Article No.: 120, Pages 1–19, https://doi.org/10.1145/3537899

Federated Learning (FL), as an emerging form of distributed machine learning (ML), can protect participants’ private data from being substantially disclosed to cyber adversaries. It has potential uses in many large-scale, data-rich environments, such as ...

research-article

Dynamic Transfer Exemplar based Facial Emotion Recognition Model Toward Online Video

Article No.: 121, Pages 1–17, https://doi.org/10.1145/3538385

In this article, we focus on the dynamic facial emotion recognition from online video. We combine deep neural networks with transfer learning theory and propose a novel model named DT-EFER. In detail, DT-EFER uses GoogLeNet to extract the deep features of ...

research-article

SETTI: A Self-supervised AdvErsarial Malware DeTection ArchiTecture in an IoT Environment

Article No.: 122, Pages 1–21, https://doi.org/10.1145/3536425

In recent years, malware detection has become an active research topic in the area of Internet of Things (IoT) security. The principle is to exploit knowledge from large quantities of continuously generated malware. Existing algorithms practise available ...

research-article

PMAL: A Proxy Model Active Learning Approach for Vision Based Industrial Applications

Article No.: 123, Pages 1–18, https://doi.org/10.1145/3534932

Deep Learning models’ performance strongly correlates with the availability of annotated data; however, massive data labeling is laborious, expensive, and error-prone when performed by human experts. Active Learning (AL) effectively handles this challenge by ...

research-article

Deep Q Network–Driven Task Offloading for Efficient Multimedia Data Analysis in Edge Computing–Assisted IoV

Article No.: 124, Pages 1–24, https://doi.org/10.1145/3548687

With the prosperity of Industry 4.0, numerous emerging industries continue to gain popularity and their market scales are expanding ceaselessly. The Internet of Vehicles (IoV), one of the thriving intelligent industries, enjoys bright development ...

research-article

Optimized Deep-Neural Network for Content-based Medical Image Retrieval in a Brownfield IoMT Network

Article No.: 125, Pages 1–26, https://doi.org/10.1145/3546194

In this paper, a brownfield Internet of Medical Things network is introduced for imaging data that can be easily scaled out depending on the objectives, functional requirements, and the number of facilities and devices connected to it. This is further ...

research-article

A Sorting Fuzzy Min-Max Model in an Embedded System for Atrial Fibrillation Detection

Article No.: 126, Pages 1–18, https://doi.org/10.1145/3554737

Atrial fibrillation detection (AFD) has attracted much attention in the field of embedded systems. In this study, we propose a sorting fuzzy min-max (SFMM) model, and then develop an SFMM-based embedded system for AF detection. The proposed SFMM model is ...

SECTION: Special Issue on Learning Representations, Similarities, and Associations in Dynamic Multimedia Environment

introduction

Free

Introduction to the Special Section on Learning Representations, Similarity, and Associations in Dynamic Multimedia Environments

Article No.: 127e, Pages 1–2, https://doi.org/10.1145/3569952

research-article

Revisiting Local Descriptor for Improved Few-Shot Classification

Article No.: 127, Pages 1–23, https://doi.org/10.1145/3511917

Few-shot classification studies the problem of quickly adapting a deep learner to understanding novel classes based on few support images. In this context, recent research efforts have been aimed at designing more and more complex classifiers that measure ...

research-article

GLPose: Global-Local Representation Learning for Human Pose Estimation

Article No.: 128, Pages 1–16, https://doi.org/10.1145/3519305

Multi-frame human pose estimation is at the core of many computer vision tasks. Although state-of-the-art approaches have demonstrated remarkable results for human pose estimation on static images, their performances inevitably come short when being ...

research-article

3D Skeleton and Two Streams Approach to Person Re-identification Using Optimized Region Matching

Article No.: 129, Pages 1–17, https://doi.org/10.1145/3538490

Person re-identification (Re-ID) is a challenging and arduous task due to non-overlapping views, complex background, and uncontrollable occlusion in video surveillance. An existing method for capturing pedestrian local region information is to divide ...

research-article

Rank-in-Rank Loss for Person Re-identification

Article No.: 130, Pages 1–21, https://doi.org/10.1145/3532866

Person re-identification (re-ID) is commonly investigated as a ranking problem. However, the performance of existing re-ID models drops dramatically, when they encounter extreme positive-negative class imbalance (e.g., very small ratio of positive and ...

research-article

Guided Graph Attention Learning for Video-Text Matching

Article No.: 131, Pages 1–23, https://doi.org/10.1145/3538533

As a bridge between videos and natural languages, video-text matching has been a hot multimedia research topic in recent years. Such cross-modal retrieval is usually achieved by learning a common embedding space where videos and text captions are directly ...

research-article

Open Access

CL²R: Compatible Lifelong Learning Representations

Article No.: 132, Pages 1–22, https://doi.org/10.1145/3564786

In this article, we propose a method to partially mimic natural intelligence for the problem of lifelong learning representations that are compatible. We take the perspective of a learning agent that is interested in recognizing object instances in an ...

ACM Transactions on Multimedia Computing, Communications, and Applications
