Artificial intelligence

71 Results for: Book/Issue: MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

Searched The ACM Guide to Computing Literature (3,802,176 records)

Showing 1 - 20 of 71 Results

research-article
January 2024
Summary of the 2023 PAIR-LITEON Competition: Embedded AI Object Detection Model Design Contest on Fish-eye Around-view Cameras
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 113, Pages 1–7, https://doi.org/10.1145/3595916.3628352

This competition is dedicated to achieving fisheye object detection in Asia, particularly in countries like Taiwan, while emphasizing low power consumption and simultaneously achieving a high mean average precision (mAP). This task is notably challenging ...
Metrics: Total Citations 0; Total Downloads 59; Last 12 Months 59; Last 6 weeks 4
research-article
January 2024
Object Detection via Fisheye Camera
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 112, Pages 1–7, https://doi.org/10.1145/3595916.3628351

During the competition, several factors that could decrease the effectiveness of the training result were quickly identified, such as the lack of distortion of provided training data, the high similarity between multiple images, and the extreme imbalance ...
Metrics: Total Citations 0; Total Downloads 135; Last 12 Months 135; Last 6 weeks 7
research-article
January 2024
Adapting Object Detection to Fisheye Cameras: A Knowledge Distillation with Semi-Pseudo-Label Approach
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 111, Pages 1–6, https://doi.org/10.1145/3595916.3628350

In this paper, we introduce a lightweight object detection system, custom-designed for fisheye cameras and optimized for quick deployment on embedded systems. Given the constraints of training solely on standard images, our methodology centers on the ...
Metrics: Total Citations 0; Total Downloads 87; Last 12 Months 87; Last 6 weeks 6
short-paper
Open Access
January 2024
Contextual Associated Triplet Queries for Panoptic Scene Graph Generation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 100, Pages 1–5, https://doi.org/10.1145/3595916.3626745

The Panoptic Scene Graph generation (PSG) task aims to extract the triplets composed of subject, object, and relation based on panoptic segmentation. For one-stage methods, PSGTR predicts the subject, object, and relation by one query. However, the ...
Metrics: Total Citations 0; Total Downloads 278; Last 12 Months 278; Last 6 weeks 28
short-paper
Open Access
January 2024
EmAGAN: Embedded Blocks Search and Mask Attention GAN for Makeup Transfer
- Li Yan,
- Wang Shibin
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 99, Pages 1–5, https://doi.org/10.1145/3595916.3626743

Currently, the results of makeup transfer are generally satisfactory in most scenarios. However, the transfer results show that the transferred makeup details are not accurate, such as in blush and lip corners. To this end, we propose a variant model of ...
Metrics: Total Citations 0; Total Downloads 74; Last 12 Months 74; Last 6 weeks 9
research-article
Open Access
January 2024
An Evaluation of Decentralized Group Formation Techniques for Flying Light Specks
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 84, Pages 1–7, https://doi.org/10.1145/3595916.3626460

Group formation is fundamental for 3D displays that use Flying Light Specks, FLSs, to illuminate shapes and provide haptic interactions. An FLS is a drone with light sources that illuminates a shape. Groups of G FLSs may implement reliability techniques ...
Metrics: Total Citations 0; Total Downloads 240; Last 12 Months 240; Last 6 weeks 22
research-article
January 2024
Monocular 3D Pose Estimation of Very Small Airplane in the Air
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 82, Pages 1–7, https://doi.org/10.1145/3595916.3626456

In this paper, a novel pose estimation algorithm is proposed specifically for maneuvering airplanes in the air. The algorithm consists of two main stages. The first stage involves semantic segmentation of a monocular input image of a flying airplane, ...
Metrics: Total Citations 0; Total Downloads 79; Last 12 Months 79; Last 6 weeks 2
research-article
January 2024
Improving Class Representation for Zero-Shot Action Recognition
- Lijuan Zhou,
- Jianing Mao
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 79, Pages 1–7, https://doi.org/10.1145/3595916.3626453

Zero-Shot Action Recognition (ZSAR) enables models to infer new action classes from previously seen data without any samples of those new classes. How an action class is represented in an understandable and processable format influences the performance ...
Metrics: Total Citations 0; Total Downloads 68; Last 12 Months 68; Last 6 weeks 2
Supplementary Material: Appendix
research-article
January 2024
Open-Vocabulary Segmentation Approach for Transformer-Based Food Nutrient Estimation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 78, Pages 1–7, https://doi.org/10.1145/3595916.3626452

Nutrition plays a vital role in overall health and well-being. With a highly accurate nutrient estimation model, we develop a tool that displays nutritional values from food images, thereby reducing the labor-intensiveness of dietary assessment. We ...
Metrics: Total Citations 2; Total Downloads 167; Last 12 Months 167; Last 6 weeks 21
research-article
January 2024
Multi-Scale Superpoint Network for 3D Point Cloud Semantic Segmentation
- Ft Zheng,
- Le Hui,
- Jin Xie,
- Haofeng Zhang
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 75, Pages 1–7, https://doi.org/10.1145/3595916.3626449

3D point cloud semantic segmentation is a fundamental task for 3D scene understanding. However, most existing pipelines usually use k-NN or ball query operation to form hard neighborhoods, which may cross different semantic objects, resulting low-...
Metrics: Total Citations 0; Total Downloads 174; Last 12 Months 174; Last 6 weeks 17
research-article
Open Access
January 2024
MontageNet: Annotated Dataset of Furniture Components in Real-World Images
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 73, Pages 1–7, https://doi.org/10.1145/3595916.3626447

Indoor understanding is currently a topic that is widely studied in the field of machine learning. Furniture is the most common object in indoor scenes, just as various vehicles are most commonly seen in street scenes. Any object is made up of a ...
Metrics: Total Citations 0; Total Downloads 915; Last 12 Months 915; Last 6 weeks 110
research-article
January 2024
GTTrack: Gaussian Transformer Tracker for Visual Tracking
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 72, Pages 1–7, https://doi.org/10.1145/3595916.3626446

Recently, Transformer-based visual object tracking methods have achieved impressive advancements and significantly improved tracking performance. In those methods, the Transformer includes two modules, self-attention and cross-attention. However, it ...
Metrics: Total Citations 0; Total Downloads 63; Last 12 Months 63; Last 6 weeks 1
research-article
January 2024
SASSM: Semantic Awareness and Self-Support Matching for Semi-Supervised Video Object Segmentation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 71, Pages 1–7, https://doi.org/10.1145/3595916.3626445

Matching-based methods have become popular in semi-supervised video object segmentation (VOS), by maintaining a memory bank to predict object masks. However, these methods encounter challenges for fast motions and appearance changes, resulting in ...
Metrics: Total Citations 0; Total Downloads 48; Last 12 Months 48; Last 6 weeks 1
research-article
January 2024
End-to-End Variable-Rate Image Compression with Bi-Resolution Spatial-Channel Context Aggregation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 70, Pages 1–7, https://doi.org/10.1145/3595916.3626444

Recently, neural network-based image compression techniques have demonstrated remarkable compression performance. The use of context-adaptive entropy models greatly enhances the rate-distortion (R-D) performance by effectively capturing spatial ...
Metrics: Total Citations 0; Total Downloads 107; Last 12 Months 107; Last 6 weeks 9
research-article
January 2024
Dual-domain Feature Learning and Cross Dimension Interaction Attention for Nighttime Image Dehazing
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 68, Pages 1–7, https://doi.org/10.1145/3595916.3626442

Nighttime image dehazing is critical for many computer applications. Directly transferring daytime dehazing models to nighttime scenes often introduces residual haze, detail loss, and color distortion due to the uneven distribution of artificial light. ...
Metrics: Total Citations 0; Total Downloads 45; Last 12 Months 45; Last 6 weeks 1
research-article
January 2024
ADNet: An Asymmetric Dual-Stream Network for RGB-T Salient Object Detection
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 66, Pages 1–7, https://doi.org/10.1145/3595916.3626440

RGB-Thermal salient object detection (RGB-T SOD) aims to locate salient objects in images that include both RGB and thermal information. Previous approaches often suggest designing a symmetric network structure to tackle the challenge of dealing with ...
Metrics: Total Citations 0; Total Downloads 81; Last 12 Months 81; Last 6 weeks 2
research-article
January 2024
Optical Flow based Feature Prediction and Decomposed Context for Video Compression
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 65, Pages 1–7, https://doi.org/10.1145/3595916.3626439

In recent years, there has been a growing interest in developing end-to-end neural video codecs. Previous works generally use a past decoded frame as reference directly, utilizing the motion information between it and the input frame to reduce temporal ...
Metrics: Total Citations 0; Total Downloads 102; Last 12 Months 102; Last 6 weeks 3
research-article
January 2024
Geometric Style Transfer for Face Portraits
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 64, Pages 1–7, https://doi.org/10.1145/3595916.3626438

Geometric style transfer jointly stylizes the texture and geometry of a content image to better match a style image, which has attracted widespread attention due to its various applications. However, existing style transfer methods either primarily ...
Metrics: Total Citations 1; Total Downloads 78; Last 12 Months 78; Last 6 weeks 3
research-article
January 2024
Generic Attention-model Explainability by Weighted Relevance Accumulation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 63, Pages 1–7, https://doi.org/10.1145/3595916.3626437

Attention-based Transformer models have achieved remarkable progress in multi-modal tasks, such as visual question answering. The explainability of attention-based methods has recently attracted wide interest as it can explain the inner changes of ...
Metrics: Total Citations 1; Total Downloads 71; Last 12 Months 71; Last 6 weeks 2
Supplementary Material: Appendix
research-article
January 2024
Rethinking Parking Slot Detection with Rotated Bounding Box
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 62, Pages 1–7, https://doi.org/10.1145/3595916.3626436

Parking slot detection is an essential yet challenging task in the field of self-driving perception. During parking, vehicles often block part of the parking slots, which occludes the corners. In addition, due to the impact of the external ...
Metrics: Total Citations 0; Total Downloads 64; Last 12 Months 64; Last 6 weeks 1
