DOI: 10.1145/3584376.3584526

Text-to-Image Algorithm Based on Fusion Mechanism

Published: 19 April 2023

Abstract

A multi-step text-to-image generation algorithm based on feature fusion is proposed to address image blurring and missing detail in text-generated images. To strengthen the ability of single-channel text features to guide multi-channel image features, the text features are transferred into the image features through a feature fusion module that enhances the text features while refining the detail of the generated images. The fusion module is executed alternately with the upsampling operation in the generator, increasing how frequently the text features are used. Arranging the generator and discriminator in three pairs allows the model to first generate a coarse image from the text and then progressively refine that coarse image into a clear, higher-resolution one. Experiments on the CUB dataset show improvements in both the Inception Score and the Fréchet Inception Distance, and comparative analysis of the generated images shows that images produced by this method have rich detail, texture, and sharpness.
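The alternating fusion-and-upsampling loop described in the abstract can be sketched as follows. This is a minimal NumPy illustration only, not the paper's implementation: the feature dimensions, the number of stages, and the affine (scale-and-shift) form of the fusion module are all assumptions, chosen in the spirit of DF-GAN-style deep fusion.

```python
import numpy as np

def fuse(image_feat, text_feat, W_gamma, W_beta):
    # Affine fusion (assumed form): the text embedding predicts a
    # per-channel scale (gamma) and shift (beta) for the image features.
    gamma = W_gamma @ text_feat        # shape (C,)
    beta = W_beta @ text_feat          # shape (C,)
    return image_feat * (1 + gamma)[:, None, None] + beta[:, None, None]

def upsample2x(feat):
    # Nearest-neighbour upsampling: repeat each pixel 2x along H and W.
    return feat.repeat(2, axis=1).repeat(2, axis=2)

rng = np.random.default_rng(0)
text_feat = rng.standard_normal(256)       # sentence embedding (dim assumed)
feat = rng.standard_normal((64, 4, 4))     # initial 4x4 image feature map

# Execute fusion alternately with upsampling, as the abstract describes,
# so the text features condition the image features at every resolution.
for _ in range(3):
    W_gamma = rng.standard_normal((64, 256)) * 0.01
    W_beta = rng.standard_normal((64, 256)) * 0.01
    feat = fuse(feat, text_feat, W_gamma, W_beta)
    feat = upsample2x(feat)

print(feat.shape)  # (64, 32, 32): three 2x stages take 4x4 up to 32x32
```

Interleaving the fusion with every upsampling step, rather than injecting the text once at the input, is what the abstract means by "enhancing the frequency of text feature usage": the text conditions the image features at each spatial resolution.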


Published In

RICAI '22: Proceedings of the 2022 4th International Conference on Robotics, Intelligent Control and Artificial Intelligence
December 2022
1396 pages
ISBN:9781450398343
DOI:10.1145/3584376

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

RICAI 2022

Acceptance Rates

Overall Acceptance Rate 140 of 294 submissions, 48%
