This repo holds the PyTorch implementation of PairAug:
[CVPR2024] PairAug: What Can Augmented Image-Text Pairs Do for Radiology?
- Create a new conda environment
conda create --name pairaug python=3.10
source activate pairaug
- Clone this repo
git clone https://github.com/YtongXie/PairAug.git
cd PairAug
- Install packages for image generation
pip install -r requirements_T2I.txt
- Download the MIMIC-CXR-JPG dataset (you need to be a credentialed user to download it)
- Put the image data under
data/MIMIC_images_ori/
The folder should look like:
├── p19_p19995997_s50123635_6fa953ea-79c237a5-4ca3be78-e3ae6427-c327e17b.png
├── p19_p19996061_s58482960_87923de8-5595ad44-eaa89d38-610e97e2-42cacf04.png
├── ...
- Run
python data/reports_jsonl.py
to generate the report data list data/metadata_train.jsonl.
- Put the report data under
data/MIMIC_reports_ori/
The folder should look like:
├── p19_p19995997_s50123635_6fa953ea-79c237a5-4ca3be78-e3ae6427-c327e17b.txt
├── p19_p19996061_s58482960_87923de8-5595ad44-eaa89d38-610e97e2-42cacf04.txt
├── ...
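Before running the generation steps, it can help to confirm that every image has a report with the same file stem. The following is a minimal sketch, not part of the repository: the helper names `parse_stem` and `unmatched_stems` are hypothetical, and reading the four underscore-separated parts as folder group, subject, study, and image ID is an assumption based on the example file names above.

```python
from pathlib import Path


def parse_stem(stem: str) -> dict:
    """Split a stem such as 'p19_p19995997_s50123635_<image-id>' into four parts.

    The meaning assigned to each part is an assumption, not something the
    repository documents.
    """
    group, subject, study, image_id = stem.split("_", 3)
    return {"group": group, "subject": subject, "study": study, "image_id": image_id}


def unmatched_stems(image_dir, report_dir) -> tuple[set, set]:
    """Return (stems with an image but no report, stems with a report but no image)."""
    images = {p.stem for p in Path(image_dir).glob("*.png")}
    reports = {p.stem for p in Path(report_dir).glob("*.txt")}
    return images - reports, reports - images


if __name__ == "__main__":
    example = "p19_p19995997_s50123635_6fa953ea-79c237a5-4ca3be78-e3ae6427-c327e17b"
    print(parse_stem(example))
```

Calling `unmatched_stems("data/MIMIC_images_ori", "data/MIMIC_reports_ori")` should return two empty sets when the folders are consistent.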
- Download the pretrained RoentGen weights and put them under
pretrained/
You need to contact the authors of the RoentGen paper to obtain them.
- Run
python InterAug_Step1.py --start_index 0 --end_index 2200000
This step generates new patient reports via ChatGPT and saves them in data/InterAug/InterAug_reports.
- Run
python InterAug_Step2_T2I.py --start_index 0 --end_index 2200000
This step generates inter-patient images via the RoentGen model and saves them in data/InterAug/InterAug_images.
- Run
python IntraAug_Step1.py --start_index 0 --end_index 2200000
This step generates intra-patient reports via ChatGPT and saves them in data/IntraAug/IntraAug_reports.
- Run
python IntraAug_Step2_T2I.py --start_index 0 --end_index 2200000
This step generates intra-patient images based on the generated and original reports and saves them in data/IntraAug/IntraAug_images.
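The generation scripts above all take `--start_index` and `--end_index`, so the full range can be split into chunks and run as separate jobs. The sketch below only builds the command strings and is not part of the repository. The helper names `shard_ranges` and `shard_commands` are hypothetical, and it assumes `--end_index` is exclusive; if the scripts treat it as inclusive, adjust the boundaries so neighbouring chunks do not overlap.

```python
def shard_ranges(start: int, end: int, num_shards: int) -> list[tuple[int, int]]:
    """Split [start, end) into num_shards contiguous ranges of near-equal size."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    if end < start:
        raise ValueError("end must not be smaller than start")
    base, extra = divmod(end - start, num_shards)
    ranges = []
    lo = start
    for i in range(num_shards):
        # The first `extra` shards take one additional index each.
        hi = lo + base + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi
    return ranges


def shard_commands(script: str, start: int, end: int, num_shards: int) -> list[str]:
    """Build one command line per shard for the given script."""
    return [
        f"python {script} --start_index {lo} --end_index {hi}"
        for lo, hi in shard_ranges(start, end, num_shards)
    ]


if __name__ == "__main__":
    for cmd in shard_commands("InterAug_Step1.py", 0, 2200000, 4):
        print(cmd)
```

Each printed line can then be launched on its own GPU or machine. The shard count of 4 is only an example.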
- Install packages for data pruning
pip install -r requirements_MedClip.txt
- Run
python InterAug_Step3_fliter.py
This step prunes the generated InterAug image-report pairs according to their semantic alignment, to ensure the quality of the retained pairs.
- Run
python IntraAug_Step3_fliter.py
This step scores the generated IntraAug image-report pairs with a hybrid consistency score, to ensure the quality of the retained pairs.
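As an illustration of alignment-based pruning in general, the sketch below keeps a pair only when the cosine similarity between its image embedding and its report embedding reaches a threshold. It is not the repository's filter and does not reproduce the hybrid consistency score. The function names, the threshold value, and the assumption that embeddings come precomputed from some image-text encoder are all hypothetical.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def prune_pairs(pairs, threshold: float) -> list:
    """Return the IDs of pairs whose image and text embeddings are aligned.

    `pairs` is a list of (pair_id, image_embedding, text_embedding) tuples.
    The threshold is a hypothetical value to tune, not one taken from the paper.
    """
    return [
        pair_id
        for pair_id, image_emb, text_emb in pairs
        if cosine_similarity(image_emb, text_emb) >= threshold
    ]


if __name__ == "__main__":
    toy_pairs = [
        ("a", [1.0, 0.0], [1.0, 0.0]),  # identical direction
        ("b", [1.0, 0.0], [0.0, 1.0]),  # orthogonal
        ("c", [1.0, 1.0], [1.0, 0.0]),  # 45 degrees apart
    ]
    print(prune_pairs(toy_pairs, threshold=0.5))
```

In practice the embeddings would be computed once per generated pair and the threshold chosen by inspecting a sample of kept and discarded pairs.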
Thanks to diffusers for the latent diffusion model, prompt-to-prompt for the prompt-to-prompt Stable Diffusion model, RoentGen for the pretrained weights, and CheXzero for medical visual-language pre-training.
Yutong Xie (yutong.xie678@gmail.com)