Author: Tang, Fan

research-article

Open Access

Dance-to-Music Generation with Encoder-based Textual Inversion

SA '24: SIGGRAPH Asia 2024 Conference Papers, Article No.: 135, Pages 1–11, https://doi.org/10.1145/3680528.3687562

The seamless integration of music with dance movements is essential for communicating the artistic intent of a dance piece. This alignment also significantly improves the immersive quality of gaming experiences and animation productions. Although there ...

research-article

Identity-Preserving Face Swapping via Dual Surrogate Generative Models

ACM Transactions on Graphics (TOG), Volume 43, Issue 5, Article No.: 161, Pages 1–19, https://doi.org/10.1145/3676165

In this study, we revisit the fundamental setting of face-swapping models and reveal that only using implicit supervision for training leads to the difficulty of advanced methods to preserve the source identity. We propose a novel reverse pseudo-input ...

research-article

Revealing the two sides of data augmentation: an asymmetric distillation-based win-win solution for open-set recognition

IJCAI '24: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, Article No.: 101, Pages 911–919, https://doi.org/10.24963/ijcai.2024/101

In this paper, we reveal the two sides of data augmentation: enhancements in closed-set recognition correlate with a significant decrease in open-set recognition. Through empirical investigation, we find that multi-sample-based augmentations would ...

research-article

E-commerce Retail Merchandise Based on Optimized K-means Algorithm and Multi-model Fusion Demand Forecasting Research

ICSCIS '24: Proceedings of the 2024 International Conference on Smart City and Information System, Pages 512–516, https://doi.org/10.1145/3685088.3685178

The new e-commerce industry in the field of e-commerce incorporating promotional factors has an important role to play. By accurately forecasting the demand for e-commerce products, it is possible to satisfy consumer needs, improve sales and marketing ...

research-article

Music style transfer with time-varying inversion of diffusion models

AAAI'24/IAAI'24/EAAI'24: Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence, Article No.: 62, Pages 547–555, https://doi.org/10.1609/aaai.v38i1.27810

With the development of diffusion models, text-guided image style transfer has demonstrated high-quality controllable synthesis results. However, the utilization of text for diverse music style transfer poses significant challenges, primarily due to the ...

research-article

A²Pt: Anti-Associative Prompt Tuning for Open Set Visual Recognition

IEEE Transactions on Multimedia (TOM), Volume 26, Pages 8419–8431, https://doi.org/10.1109/TMM.2023.3339387

Multi-modality pre-trained models (PTMs) have considerably boosted the performance on a broad range of computer vision topics. Still, they have not been explored purposefully in open set recognition (OSR) scenarios when applying PTMs to downstream ...

research-article

Open Access

ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

ACM Transactions on Graphics (TOG), Volume 42, Issue 6, Article No.: 244, Pages 1–14, https://doi.org/10.1145/3618342

Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image ...

research-article

Open Access

A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning

ACM Transactions on Graphics (TOG), Volume 42, Issue 5, Article No.: 169, Pages 1–16, https://doi.org/10.1145/3605548

This work presents Unified Contrastive Arbitrary Style Transfer (UCAST), a novel style representation learning and transfer framework, that can fit in most existing arbitrary image style transfer models, such as CNN-based, ViT-based, and flow-based ...

research-article

CrossRectify: Leveraging disagreement for semi-supervised object detection

Pattern Recognition (PATT), Volume 137, Issue C, https://doi.org/10.1016/j.patcog.2022.109280

Highlights

We point out that the performances of self-labeling-based semi-supervised object detection (SSOD) approaches are always limited, and the reason behind such ...

Abstract

Semi-supervised object detection has recently achieved substantial progress. As a mainstream solution, the self-labeling-based methods train the detector on both labeled data and unlabeled data with pseudo labels predicted by the ...

research-article

Semantic-Context Graph Network for Point-Based 3D Object Detection

IEEE Transactions on Circuits and Systems for Video Technology (IEEETCSVT), Volume 33, Issue 11, Pages 6474–6486, https://doi.org/10.1109/TCSVT.2023.3271318

Point-based indoor 3D object detection has received increasing attention with the large demand for augmented reality, autonomous driving, and robot technology in the industry. However, the detection precision suffers from inputs with semantic ambiguity, ...

research-article

SPA²Net: Structure-Preserved Attention Activated Network for Weakly Supervised Object Localization

IEEE Transactions on Image Processing (TIP), Volume 32, Pages 5779–5793, https://doi.org/10.1109/TIP.2023.3323793

By exploring the localizable representations in deep CNN, weakly supervised object localization (WSOL) methods could determine the position of the object in each image just trained by the classification task. However, the partial activation problem caused ...

research-article

SMNet: Synchronous Multi-Scale Low Light Enhancement Network With Local and Global Concern

IEEE Transactions on Multimedia (TOM), Volume 25, Pages 9506–9517, https://doi.org/10.1109/TMM.2023.3254141

Limited by objectively poor lighting conditions and hardware devices, low-light images with low visual quality and low visibility are inevitable in the real world. Accurate local details and reasonable global information play their essential and distinct ...

poster

Language-driven Diversified Image Retargeting

SA '22: SIGGRAPH Asia 2022 Posters, Article No.: 19, Pages 1–2, https://doi.org/10.1145/3550082.3564169

Content-aware image resizing could automatically retarget an image to different aspect ratios while preserving visually salient contents. However, it is difficult for users to interact with the retargeting process and control the results. In this paper,...

research-article

Bias oriented unbiased data augmentation for cross-bias representation learning

Multimedia Systems (MUME), Volume 29, Issue 2, Pages 725–738, https://doi.org/10.1007/s00530-022-01013-6

Abstract

The biased cues in the training data may build strong connections between specific targets and unexpected concepts, so that the learned representations cannot be applied to real-world data that does not contain the same biased cues. To learn ...

research-article

Open Access

Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion

MM '22: Proceedings of the 30th ACM International Conference on Multimedia, Pages 1085–1094, https://doi.org/10.1145/3503161.3548282

Digital art synthesis is receiving increasing attention in the multimedia community because of engaging the public with art effectively. Current digital art synthesis methods usually use single-modality inputs as guidance, thereby limiting the ...

research-article

Open Access

Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning

SIGGRAPH '22: ACM SIGGRAPH 2022 Conference Proceedings, Article No.: 12, Pages 1–8, https://doi.org/10.1145/3528233.3530736

In this work, we tackle the challenging problem of arbitrary image style transfer using a novel style feature representation learning method. A suitable style representation, as a key component in image stylization tasks, is essential to achieve ...

research-article

A Comparative Study of CNN- and Transformer-Based Visual Style Transfer

Journal of Computer Science and Technology (JCST), Volume 37, Issue 3, Pages 601–614, https://doi.org/10.1007/s11390-022-2140-7

Abstract

Vision Transformer has shown impressive performance on the image classification tasks. Observing that most existing visual style transfer (VST) algorithms are based on the texture-biased convolution neural network (CNN), here raises the question ...

research-article

Distribution Aligned Multimodal and Multi-domain Image Stylization

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 17, Issue 3, Article No.: 96, Pages 1–17, https://doi.org/10.1145/3450525

Multimodal and multi-domain stylization are two important problems in the field of image style transfer. Currently, there are few methods that can perform multimodal and multi-domain stylization simultaneously. In this study, we propose a unified ...

research-article

Open Access

Destylization of text with decorative elements

MMAsia '20: Proceedings of the 2nd ACM International Conference on Multimedia in Asia, Article No.: 14, Pages 1–7, https://doi.org/10.1145/3444685.3446324

Style text with decorative elements has a strong visual sense, and enriches our daily work, study and life. However, it introduces new challenges to text detection and recognition. In this study, we propose a text destylized framework, that can ...

research-article

Exploring the Representativity of Art Paintings

IEEE Transactions on Multimedia (TOM), Volume 23, Pages 2794–2805, https://doi.org/10.1109/TMM.2020.3016887

Art painting evaluation is sophisticated for a novice with no or limited knowledge on art criticism, and history. In this study, we propose the concept of representativity to evaluate paintings instead of using professional concepts, such ...
