- research-article, January 2024
Summary of the 2023 PAIR-LITEON Competition: Embedded AI Object Detection Model Design Contest on Fish-eye Around-view Cameras
Yu-Shu Ni, Chia-Chi Tsai, Jyun-Syu Lin, Hsien-Po Meng, Po-Chi Hu, Jiun-Shiung Chen, Kun-Hung Lin, Chih-Yuan Chuang, Jiun-In Guo
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 113, Pages 1–7, https://doi.org/10.1145/3595916.3628352
This competition is dedicated to achieving fisheye object detection in Asia, particularly in countries like Taiwan, while emphasizing low power consumption and simultaneously achieving a high mean average precision (mAP). This task is notably challenging ...
- research-article, January 2024
Object Detection via Fisheye Camera
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 112, Pages 1–7, https://doi.org/10.1145/3595916.3628351
During the competition, several factors that could decrease the effectiveness of the training result were quickly identified, such as the lack of distortion of provided training data, the high similarity between multiple images, and the extreme imbalance ...
- research-article, January 2024
Adapting Object Detection to Fisheye Cameras: A Knowledge Distillation with Semi-Pseudo-Label Approach
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 111, Pages 1–6, https://doi.org/10.1145/3595916.3628350
In this paper, we introduce a lightweight object detection system, custom-designed for fisheye cameras and optimized for quick deployment on embedded systems. Given the constraints of training solely on standard images, our methodology centers on the ...
- short-paper, January 2024
Contextual Associated Triplet Queries for Panoptic Scene Graph Generation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 100, Pages 1–5, https://doi.org/10.1145/3595916.3626745
The Panoptic Scene Graph generation (PSG) task aims to extract the triplets composed of subject, object, and relation based on panoptic segmentation. For one-stage methods, PSGTR predicts the subject, object, and relation by one query. However, the ...
- short-paper, January 2024
EmAGAN: Embedded Blocks Search and Mask Attention GAN for Makeup Transfer
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 99, Pages 1–5, https://doi.org/10.1145/3595916.3626743
Currently, the results of makeup transfer are generally satisfactory in most scenarios. However, the transfer results show that the transferred makeup details are not accurate, such as in blush and lip corners. To this end, we propose a variant model of ...
- research-article, January 2024
An Evaluation of Decentralized Group Formation Techniques for Flying Light Specks
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 84, Pages 1–7, https://doi.org/10.1145/3595916.3626460
Group formation is fundamental for 3D displays that use Flying Light Specks, FLSs, to illuminate shapes and provide haptic interactions. An FLS is a drone with light sources that illuminates a shape. Groups of G FLSs may implement reliability techniques ...
- research-article, January 2024
Monocular 3D Pose Estimation of Very Small Airplane in the Air
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 82, Pages 1–7, https://doi.org/10.1145/3595916.3626456
In this paper, a novel pose estimation algorithm is proposed specifically for maneuvering airplanes in the air. The algorithm consists of two main stages. The first stage involves semantic segmentation of a monocular input image of a flying airplane, ...
- research-article, January 2024
Improving Class Representation for Zero-Shot Action Recognition
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 79, Pages 1–7, https://doi.org/10.1145/3595916.3626453
Zero-Shot Action Recognition (ZSAR) enables models to infer new action classes from previously seen data without any samples of those new classes. How an action class is represented in an understandable and processable format influences the performance ...
- research-article, January 2024
Open-Vocabulary Segmentation Approach for Transformer-Based Food Nutrient Estimation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 78, Pages 1–7, https://doi.org/10.1145/3595916.3626452
Nutrition plays a vital role in overall health and well-being. With a highly accurate nutrient estimation model, we develop a tool that displays nutritional values from food images, thereby reducing the labor-intensiveness of dietary assessment. We ...
- research-article, January 2024
Multi-Scale Superpoint Network for 3D Point Cloud Semantic Segmentation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 75, Pages 1–7, https://doi.org/10.1145/3595916.3626449
3D point cloud semantic segmentation is a fundamental task for 3D scene understanding. However, most existing pipelines usually use k-NN or ball query operation to form hard neighborhoods, which may cross different semantic objects, resulting low-...
- research-article, January 2024
MontageNet: Annotated Dataset of Furniture Components in Real-World Images
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 73, Pages 1–7, https://doi.org/10.1145/3595916.3626447
Indoor understanding is currently a topic that is widely studied in the field of machine learning. Furniture is the most common object in indoor scenes, just as various vehicles are most commonly seen in street scenes. Any object is made up of a ...
- research-article, January 2024
GTTrack: Gaussian Transformer Tracker for Visual Tracking
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 72, Pages 1–7, https://doi.org/10.1145/3595916.3626446
Recently, Transformer based visual object tracking methods have achieved impressive advancements and significantly improved tracking performance. Transformer includes two modules of self-attention and cross-attention for those methods. However, it ...
- research-article, January 2024
SASSM: Semantic Awareness and Self-Support Matching for Semi-Supervised Video Object Segmentation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 71, Pages 1–7, https://doi.org/10.1145/3595916.3626445
Matching-based methods have become popular in semi-supervised video object segmentation (VOS), by maintaining a memory bank to predict object masks. However, these methods encounter challenges for fast motions and appearance changes, resulting in ...
- research-article, January 2024
End-to-End Variable-Rate Image Compression with Bi-Resolution Spatial-Channel Context Aggregation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 70, Pages 1–7, https://doi.org/10.1145/3595916.3626444
Recently, neural network-based image compression techniques have demonstrated remarkable compression performance. The use of context-adaptive entropy models greatly enhances the rate-distortion (R-D) performance by effectively capturing spatial ...
- research-article, January 2024
Dual-domain Feature Learning and Cross Dimension Interaction Attention for Nighttime Image Dehazing
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 68, Pages 1–7, https://doi.org/10.1145/3595916.3626442
Nighttime image dehazing is critical for many computer applications. Directly transferring daytime dehazing models to nighttime scenes often introduces haze residual, detail loss and color distortion for the uneven distribution by artificial lights. ...
- research-article, January 2024
ADNet: An Asymmetric Dual-Stream Network for RGB-T Salient Object Detection
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 66, Pages 1–7, https://doi.org/10.1145/3595916.3626440
RGB-Thermal salient object detection (RGB-T SOD) aims to locate salient objects in images that include both RGB and thermal information. Previous approaches often suggest designing a symmetric network structure to tackle the challenge of dealing with ...
- research-article, January 2024
Optical Flow based Feature Prediction and Decomposed Context for Video Compression
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 65, Pages 1–7, https://doi.org/10.1145/3595916.3626439
In recent years, there has been a growing interest in developing end-to-end neural video codecs. Previous works generally use a past decoded frame as reference directly, utilizing the motion information between it and the input frame to reduce temporal ...
- research-article, January 2024
Geometric Style Transfer for Face Portraits
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 64, Pages 1–7, https://doi.org/10.1145/3595916.3626438
Geometric style transfer jointly stylizes the texture and geometry of a content image to better match a style image, which has attracted widespread attention due to its various applications. However, existing style transfer methods either primarily ...
- research-article, January 2024
Generic Attention-model Explainability by Weighted Relevance Accumulation
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 63, Pages 1–7, https://doi.org/10.1145/3595916.3626437
Attention-based Transformer models have achieved remarkable progress in multi-modal tasks, such as visual question answering. The explainability of attention-based methods has recently attracted wide interest as it can explain the inner changes of ...
- research-article, January 2024
Rethinking Parking Slot Detection with Rotated Bounding Box
MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia, Article No.: 62, Pages 1–7, https://doi.org/10.1145/3595916.3626436
Parking slot detection is an essential yet challenging task in the field of self-driving perception. During parking, vehicles often block part of the parking slots which makes the corners occluded. In addition, due to the impact of the external ...