research-article
Efficient Light Field Image Compression with Enhanced Random Access
Article No.: 44, Pages 1–18, https://doi.org/10.1145/3471905

In light field image compression, facilitating random access to individual views plays a significant role in decoding views quickly, reducing memory footprint, and decreasing the bandwidth requirement for transmission. Highly efficient light field image ...

research-article
Evaluation of an Intervention Program Based on Mobile Apps to Learn Sexism Prevention in Teenagers
Article No.: 45, Pages 1–20, https://doi.org/10.1145/3471139

The fight against sexism is nowadays one of the flagship social movements in western countries. Adolescence is a crucial period, and some empirical studies have focused on the socialization of teenagers, proving that the socialization with the surrounding ...

research-article
Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition
Article No.: 46, Pages 1–24, https://doi.org/10.1145/3472722

Rapid progress and superior performance have been achieved for skeleton-based action recognition recently. In this article, we investigate this problem under a cross-dataset setting, which is a new, pragmatic, and challenging task in real-world scenarios. ...

research-article
Open Access
An Effective Forest Fire Detection Framework Using Heterogeneous Wireless Multimedia Sensor Networks
Article No.: 47, Pages 1–21, https://doi.org/10.1145/3473037

With improvements in the area of Internet of Things (IoT), surveillance systems have recently become more accessible. At the same time, optimizing the energy requirements of smart sensors, especially for data transmission, has always been very ...

research-article
Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training
Article No.: 48, Pages 1–16, https://doi.org/10.1145/3473140

Vision-language pre-training is an emerging and fast-developing research topic that transfers multi-modal knowledge from rich-resource pre-training tasks to limited-resource downstream tasks. Unlike existing works that predominantly learn a single ...

research-article
Cascaded Structure-Learning Network with Using Adversarial Training for Robust Facial Landmark Detection
Article No.: 49, Pages 1–20, https://doi.org/10.1145/3474595

Recently, great progress has been achieved in facial landmark detection based on convolutional neural networks, but the task remains challenging due to partial occlusion and extreme head poses. In this paper, we propose a Cascaded Structure-Learning Network (...

research-article
Machine Learning Based Content-Agnostic Viewport Prediction for 360-Degree Video
Article No.: 50, Pages 1–24, https://doi.org/10.1145/3474833

Accurate and fast estimations or predictions of the (near) future location of the users of head-mounted devices within the virtual omnidirectional environment open a plethora of opportunities in application domains such as interactive immersive gaming and ...

research-article
Generating Virtual Wire Sculptural Art from 3D Models
Article No.: 51, Pages 1–23, https://doi.org/10.1145/3475798

Wire sculptures are objects sculpted by the use of wires. In this article, we propose practical methods to create 3D virtual wire sculptural art from a given 3D model. In contrast, most of the previous 3D wire art results are reconstructed from input 2D ...

research-article
Response Generation by Jointly Modeling Personalized Linguistic Styles and Emotions
Article No.: 52, Pages 1–20, https://doi.org/10.1145/3475872

Natural language generation (NLG) has been an essential technique for various applications, such as XiaoIce and Siri, and has attracted increasing attention recently. To improve the user experience, several emotion-aware NLG methods have been developed to generate ...

research-article
An l½ and Graph Regularized Subspace Clustering Method for Robust Image Segmentation
Article No.: 53, Pages 1–24, https://doi.org/10.1145/3476514

Segmenting meaningful visual structures from an image is a fundamental and most-addressed problem in image analysis algorithms. However, among factors such as diverse visual patterns, noise, complex backgrounds, and similar textures present in foreground ...

research-article
Will You Ever Become Popular? Learning to Predict Virality of Dance Clips
Article No.: 54, Pages 1–24, https://doi.org/10.1145/3477533

Dance challenges are going viral in video communities like TikTok nowadays. Once a challenge becomes popular, thousands of short-form videos will be uploaded within a couple of days. Therefore, virality prediction from dance challenges is of great ...

research-article
Deep Semantic and Attentive Network for Unsupervised Video Summarization
Article No.: 55, Pages 1–21, https://doi.org/10.1145/3477538

With the rapid growth of video data, video summarization is a promising approach to shorten a lengthy video into a compact version. Although supervised summarization approaches have achieved state-of-the-art performance, they require frame-level annotated ...

research-article
Moment is Important: Language-Based Video Moment Retrieval via Adversarial Learning
Article No.: 56, Pages 1–21, https://doi.org/10.1145/3478025

The newly emerging language-based video moment retrieval task aims at retrieving a target video moment from an untrimmed video given a natural language query. It is more applicable in reality since it is able to accurately localize a specific video ...

research-article
Learning Transferable Perturbations for Image Captioning
Article No.: 57, Pages 1–18, https://doi.org/10.1145/3478024

Recent studies have discovered that state-of-the-art deep learning models can be attacked by small but well-designed perturbations. Existing attack algorithms for the image captioning task are time-consuming, and their generated adversarial examples ...

research-article
SADnet: Semi-supervised Single Image Dehazing Method Based on an Attention Mechanism
Article No.: 58, Pages 1–23, https://doi.org/10.1145/3478457

Many real-life tasks such as military reconnaissance and traffic monitoring require high-quality images. However, images acquired in foggy or hazy weather pose obstacles to the implementation of these real-life tasks; consequently, image dehazing is an ...

research-article
Tell, Imagine, and Search: End-to-end Learning for Composing Text and Image to Image Retrieval
Article No.: 59, Pages 1–23, https://doi.org/10.1145/3478642

Composing Text and Image to Image Retrieval (CTI-IR) is an emerging task in computer vision, which allows retrieving images relevant to a query image with text describing desired modifications to the query image. Most conventional cross-modal retrieval ...

research-article
Structure-aware Meta-fusion for Image Super-resolution
Article No.: 60, Pages 1–25, https://doi.org/10.1145/3477553

There are two main categories of image super-resolution algorithms: distortion oriented and perception oriented. Recent evidence shows that reconstruction accuracy and perceptual quality are typically in disagreement with each other. In this article, we ...

research-article
Non-Acted Text and Keystrokes Database and Learning Methods to Recognize Emotions
Article No.: 61, Pages 1–24, https://doi.org/10.1145/3480968

Modern computing applications are adapting to the ready availability of large and diverse data to make their pattern recognition methods smarter. Identification of dominant emotion solely based on the text data generated by humans is ...

research-article
Transform, Warp, and Dress: A New Transformation-guided Model for Virtual Try-on
Article No.: 62, Pages 1–24, https://doi.org/10.1145/3491226

Virtual try-on has recently emerged in computer vision and multimedia communities with the development of architectures that can generate realistic images of a target person wearing a custom garment. This research interest is motivated by the large role ...

research-article
Adversarial Multi-Grained Embedding Network for Cross-Modal Text-Video Retrieval
Article No.: 63, Pages 1–23, https://doi.org/10.1145/3483381

Cross-modal retrieval between texts and videos has received consistent research interest in the multimedia community. Existing studies follow a trend of learning a joint embedding space to measure the distance between text and video representations. In ...

research-article
Fully Unsupervised Person Re-Identification via Selective Contrastive Learning
Article No.: 64, Pages 1–15, https://doi.org/10.1145/3485061

Person re-identification (ReID) aims to find the same person across images captured by different cameras. Existing fully supervised person ReID methods usually suffer from poor generalization capability caused by domain gaps. Unsupervised ...

research-article
Music2Dance: DanceNet for Music-Driven Dance Generation
Article No.: 65, Pages 1–21, https://doi.org/10.1145/3485664

Synthesizing human motions from music (i.e., music to dance) is appealing and has attracted considerable research interest in recent years. It is challenging because of the requirement for realistic and complex human motions for dance, but more importantly, the ...

survey
Understanding and Creating Art with AI: Review and Outlook
Article No.: 66, Pages 1–22, https://doi.org/10.1145/3475799

Technologies related to artificial intelligence (AI) have a strong impact on the changes of research and creative practices in visual arts. The growing number of research initiatives and creative applications that emerge in the intersection of AI and art ...
