- research-article, December 2024
Dance-to-Music Generation with Encoder-based Textual Inversion
SA '24: SIGGRAPH Asia 2024 Conference Papers, Article No.: 135, Pages 1–11, https://doi.org/10.1145/3680528.3687562
The seamless integration of music with dance movements is essential for communicating the artistic intent of a dance piece. This alignment also significantly improves the immersive quality of gaming experiences and animation productions. Although there ...
- research-article, August 2024
Identity-Preserving Face Swapping via Dual Surrogate Generative Models
ACM Transactions on Graphics (TOG), Volume 43, Issue 5, Article No.: 161, Pages 1–19, https://doi.org/10.1145/3676165
In this study, we revisit the fundamental setting of face-swapping models and reveal that only using implicit supervision for training leads to the difficulty of advanced methods to preserve the source identity. We propose a novel reverse pseudo-input ...
- research-article, January 2025
Revealing the two sides of data augmentation: an asymmetric distillation-based win-win solution for open-set recognition
IJCAI '24: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, Article No.: 101, Pages 911–919, https://doi.org/10.24963/ijcai.2024/101
In this paper, we reveal the two sides of data augmentation: enhancements in closed-set recognition correlate with a significant decrease in open-set recognition. Through empirical investigation, we find that multi-sample-based augmentations would ...
- research-article, August 2024
E-commerce Retail Merchandise Based on Optimized K-means Algorithm and Multi-model Fusion Demand Forecasting Research
ICSCIS '24: Proceedings of the 2024 International Conference on Smart City and Information System, Pages 512–516, https://doi.org/10.1145/3685088.3685178
The new e-commerce industry in the field of e-commerce incorporating promotional factors has an important role to play. By accurately forecasting the demand for e-commerce products, it is possible to satisfy consumer needs, improve sales and marketing ...
- research-article, January 2025
Music style transfer with time-varying inversion of diffusion models
AAAI'24/IAAI'24/EAAI'24: Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence, Article No.: 62, Pages 547–555, https://doi.org/10.1609/aaai.v38i1.27810
With the development of diffusion models, text-guided image style transfer has demonstrated high-quality controllable synthesis results. However, the utilization of text for diverse music style transfer poses significant challenges, primarily due to the ...
- research-article, December 2023
$A^{2}Pt$: Anti-Associative Prompt Tuning for Open Set Visual Recognition
IEEE Transactions on Multimedia (TOM), Volume 26, Pages 8419–8431, https://doi.org/10.1109/TMM.2023.3339387
Multi-modality pre-trained models (PTMs) have considerably boosted the performance on a broad range of computer vision topics. Still, they have not been explored purposefully in open set recognition (OSR) scenarios when applying PTMs to downstream ...
- research-article, December 2023
ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models
- Yuxin Zhang,
- Weiming Dong,
- Fan Tang,
- Nisha Huang,
- Haibin Huang,
- Chongyang Ma,
- Tong-Yee Lee,
- Oliver Deussen,
- Changsheng Xu
ACM Transactions on Graphics (TOG), Volume 42, Issue 6, Article No.: 244, Pages 1–14, https://doi.org/10.1145/3618342
Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image ...
- research-article, July 2023
A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning
ACM Transactions on Graphics (TOG), Volume 42, Issue 5, Article No.: 169, Pages 1–16, https://doi.org/10.1145/3605548
This work presents Unified Contrastive Arbitrary Style Transfer (UCAST), a novel style representation learning and transfer framework, that can fit in most existing arbitrary image style transfer models, such as CNN-based, ViT-based, and flow-based ...
- research-article, May 2023
CrossRectify: Leveraging disagreement for semi-supervised object detection
Highlights: We point out that the performances of self-labeling-based semi-supervised object detection (SSOD) approaches are always limited, and the reason behind such ...
Semi-supervised object detection has recently achieved substantial progress. As a mainstream solution, the self-labeling-based methods train the detector on both labeled data and unlabeled data with pseudo labels predicted by the ...
- research-article, April 2023
Semantic-Context Graph Network for Point-Based 3D Object Detection
IEEE Transactions on Circuits and Systems for Video Technology (IEEETCSVT), Volume 33, Issue 11, Pages 6474–6486, https://doi.org/10.1109/TCSVT.2023.3271318
Point-based indoor 3D object detection has received increasing attention with the large demand for augmented reality, autonomous driving, and robot technology in the industry. However, the detection precision suffers from inputs with semantic ambiguity, ...
- research-article, October 2023
SPA²Net: Structure-Preserved Attention Activated Network for Weakly Supervised Object Localization
IEEE Transactions on Image Processing (TIP), Volume 32, Pages 5779–5793, https://doi.org/10.1109/TIP.2023.3323793
By exploring the localizable representations in deep CNN, weakly supervised object localization (WSOL) methods could determine the position of the object in each image just trained by the classification task. However, the partial activation problem caused ...
- research-article, March 2023
SMNet: Synchronous Multi-Scale Low Light Enhancement Network With Local and Global Concern
IEEE Transactions on Multimedia (TOM), Volume 25, Pages 9506–9517, https://doi.org/10.1109/TMM.2023.3254141
Limited by objectively poor lighting conditions and hardware devices, low-light images with low visual quality and low visibility are inevitable in the real world. Accurate local details and reasonable global information play their essential and distinct ...
- poster, December 2022
Language-driven Diversified Image Retargeting
Content-aware image resizing could automatically retarget an image to different aspect ratios while preserving visually salient contents. However, it is difficult for users to interact with the retargeting process and control the results. In this paper,...
- research-article, October 2022
Bias oriented unbiased data augmentation for cross-bias representation learning
Multimedia Systems (MUME), Volume 29, Issue 2, Pages 725–738, https://doi.org/10.1007/s00530-022-01013-6
The biased cues in the training data may build strong connections between specific targets and unexpected concepts, leading the learned representations could not be applied to real-world data that does not contain the same biased cues. To learn ...
- research-article, October 2022
Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion
MM '22: Proceedings of the 30th ACM International Conference on Multimedia, Pages 1085–1094, https://doi.org/10.1145/3503161.3548282
Digital art synthesis is receiving increasing attention in the multimedia community because of engaging the public with art effectively. Current digital art synthesis methods usually use single-modality inputs as guidance, thereby limiting the ...
- research-article, July 2022
Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning
SIGGRAPH '22: ACM SIGGRAPH 2022 Conference Proceedings, Article No.: 12, Pages 1–8, https://doi.org/10.1145/3528233.3530736
In this work, we tackle the challenging problem of arbitrary image style transfer using a novel style feature representation learning method. A suitable style representation, as a key component in image stylization tasks, is essential to achieve ...
- research-article, June 2022
A Comparative Study of CNN- and Transformer-Based Visual Style Transfer
Journal of Computer Science and Technology (JCST), Volume 37, Issue 3, Pages 601–614, https://doi.org/10.1007/s11390-022-2140-7
Vision Transformer has shown impressive performance on the image classification tasks. Observing that most existing visual style transfer (VST) algorithms are based on the texture-biased convolution neural network (CNN), here raises the question ...
- research-article, July 2021
Distribution Aligned Multimodal and Multi-domain Image Stylization
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 17, Issue 3, Article No.: 96, Pages 1–17, https://doi.org/10.1145/3450525
Multimodal and multi-domain stylization are two important problems in the field of image style transfer. Currently, there are few methods that can perform multimodal and multi-domain stylization simultaneously. In this study, we propose a unified ...
- research-article, May 2021
Destylization of text with decorative elements
MMAsia '20: Proceedings of the 2nd ACM International Conference on Multimedia in Asia, Article No.: 14, Pages 1–7, https://doi.org/10.1145/3444685.3446324
Style text with decorative elements has a strong visual sense, and enriches our daily work, study and life. However, it introduces new challenges to text detection and recognition. In this study, we propose a text destylized framework, that can ...
- research-article, January 2021
Exploring the Representativity of Art Paintings
IEEE Transactions on Multimedia (TOM), Volume 23, Pages 2794–2805, https://doi.org/10.1109/TMM.2020.3016887
Art painting evaluation is sophisticated for a novice with no or limited knowledge on art criticism, and history. In this study, we propose the concept of representativity to evaluate paintings instead of using professional concepts, such ...