WSVM: Weakly Supervised Object Detection for Automatic Tooth-marked Tongue Recognition

WSVM is a fully automated method using Vision Transformer (ViT) and Multiple Instance Learning (MIL) for tongue extraction and tooth-marked tongue recognition. It accurately detects the tongue region in clinical images and uses weakly supervised learning to identify tooth-marked areas with only image-level annotations. WSVM enhances the objectivity and accuracy of tongue diagnosis in Traditional Chinese Medicine (TCM).
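The core idea is to treat each ViT patch token as an instance in a bag whose label is the image-level annotation. The sketch below shows one common way to realize this with gated-attention MIL pooling; it is an illustration only, and the class and attribute names (MILHead, attn_v, attn_u, attn_w) are hypothetical rather than the exact architecture used in this repository.

import torch
import torch.nn as nn

class MILHead(nn.Module):
    """Gated-attention MIL pooling over ViT patch tokens.
    Each patch token is an instance; the image-level label supervises
    the pooled bag representation."""

    def __init__(self, dim: int = 768, hidden: int = 128):
        super().__init__()
        self.attn_v = nn.Linear(dim, hidden)
        self.attn_u = nn.Linear(dim, hidden)
        self.attn_w = nn.Linear(hidden, 1)
        self.classifier = nn.Linear(dim, 1)  # tooth-marked vs. normal

    def forward(self, tokens: torch.Tensor):
        # tokens: (B, N, D) patch embeddings from the ViT encoder (CLS token removed)
        a = self.attn_w(torch.tanh(self.attn_v(tokens)) * torch.sigmoid(self.attn_u(tokens)))
        weights = torch.softmax(a, dim=1)          # (B, N, 1) instance attention
        bag = (weights * tokens).sum(dim=1)        # (B, D) bag embedding
        logit = self.classifier(bag).squeeze(-1)   # (B,) image-level logit
        return logit, weights.squeeze(-1)          # high-attention patches indicate tooth-marked regions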

Table of Contents

  • Installation
  • Dataset and Pre-trained Model
  • Getting Started
  • Project Structure
  • Acknowledgements

Installation

Prerequisites

  • Python 3.9.19
  • PyTorch 2.2.2, CUDA 12.3
  • Required Python packages (specified in environment.yaml)

Setup

Clone the repository:

git clone https://github.com/yc-zh/WSVM.git
cd WSVM

Create the conda environment and install dependencies:

conda env create -f environment.yaml
conda activate WSVM

Dataset and Pre-trained Model

Dataset

Download the dataset from the following link: Tongue Image Dataset, and place it in the data/tongue directory.
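The exact directory layout expected by train.py is defined in the code itself; the loader below is a minimal sketch assuming one subfolder per class under data/tongue (the folder names "normal" and "tooth_marked" and the class name TongueDataset are assumptions, not taken from this repository).

from pathlib import Path
from PIL import Image
from torch.utils.data import Dataset

class TongueDataset(Dataset):
    """Illustrative loader assuming data/tongue/<class_name>/<image>.jpg,
    where class_name is 'normal' or 'tooth_marked'."""

    def __init__(self, root="data/tongue", transform=None):
        self.samples = []
        self.transform = transform
        for label, cls in enumerate(["normal", "tooth_marked"]):
            for p in sorted(Path(root, cls).glob("*.jpg")):
                self.samples.append((p, label))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, label = self.samples[idx]
        img = Image.open(path).convert("RGB")
        if self.transform is not None:
            img = self.transform(img)
        return img, label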

Pretrained Model

The pre-trained model weights are based on deep-learning-for-image-processing. Download the weights from the following link: Pre-trained model, and place them in the vit_weights directory. The extraction code is eu9f.
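Once downloaded, ImageNet-pretrained ViT weights are typically loaded with the original classification head dropped, so it can be re-initialized for the two-class tongue task. A minimal sketch, assuming a standard ViT state dict whose head parameters are prefixed with "head." (the file name and key prefix are assumptions, not taken from this repository):

import torch

def load_vit_weights(model, path="vit_weights/vit_base_patch16_224.pth"):
    """Load pretrained ViT weights, dropping the original classification head.
    File name and key prefix are assumptions; adjust to the downloaded file."""
    state = torch.load(path, map_location="cpu")
    state = {k: v for k, v in state.items() if not k.startswith("head.")}
    missing, unexpected = model.load_state_dict(state, strict=False)
    print(f"missing keys: {missing}\nunexpected keys: {unexpected}")
    return model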

Getting Started

Training the Model

To train the model, run the following command:

python train.py
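Hyperparameters and arguments are defined inside train.py. For orientation, a minimal weakly supervised training loop under the assumptions sketched above (the model returns an image-level logit plus attention weights, and labels are 0/1 image-level annotations) might look like the following; the batch size, optimizer, and learning rate here are illustrative, not the repository's settings.

import torch
from torch import nn
from torch.utils.data import DataLoader

def train_one_epoch(model, dataset, device="cuda", lr=1e-4):
    """Minimal weakly supervised training loop: only image-level labels
    (tooth-marked / normal) drive the loss; no box or mask annotations."""
    loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=4)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    criterion = nn.BCEWithLogitsLoss()
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.float().to(device)
        logits, _ = model(images)        # assumes model returns (logit, attention)
        loss = criterion(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()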

Testing the Model

To test the model, use:

python test.py
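test.py handles the actual evaluation. A minimal sketch of image-level accuracy under the same assumed model interface (logit plus attention weights) is shown below; the attention weights could additionally be thresholded to visualize suspected tooth-marked regions, though the repository's own visualization code may differ.

import torch

@torch.no_grad()
def evaluate(model, loader, device="cuda"):
    """Compute image-level classification accuracy."""
    model.eval()
    correct = total = 0
    for images, labels in loader:
        logits, _ = model(images.to(device))
        preds = (torch.sigmoid(logits) > 0.5).long().cpu()
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total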

Project Structure

WSVM/
├── data/                     # Directory for storing datasets
├── models/                   # Fine-tuned model files
├── tongue_extraction/        # Scripts for tongue foreground extraction
├── vision_transformer/       # Vision Transformer related code
├── vit_weights/              # Pre-trained ViT weights
├── environment.yaml          # Conda environment configuration file
├── README.md                 # Project documentation
├── test.py                   # Testing script
├── train.py                  # Training script
└── utils.py                  # Utility functions

Acknowledgements

Thanks to deep-learning-for-image-processing, SAM, and YOLOv8 for their public code and released models.
