AI-Dermatologist is an AI-powered medical assistant that combines vision, speech-to-text, and text-to-speech capabilities to simulate a professional doctor–patient interaction. It enables users to record their voice, submit medical images, receive diagnostic insights, and hear spoken responses.
- Voice Interaction: Record patient audio via microphone and transcribe using Groq's Whisper model.
- Image Analysis: Process and evaluate medical images (e.g., skin lesions) using a multimodal LLM (Meta LLaMA Scout).
- AI-Generated Responses: Generate concise, doctor-style text responses without AI disclaimers or formatting.
- Text-to-Speech: Convert the AI-generated diagnosis into natural-sounding speech via gTTS (Google) or ElevenLabs.
- Web UI: Intuitive Gradio interface for seamless voice and image input, and playback of responses.
- Audio Capture: User records their voice; saved as MP3 via `speech_recognition` and `pydub`.
- Transcription: Transcribe audio to text using Groq Whisper (`whisper-large-v3`).
- Prompt Assembly: Combine the system prompt with the transcription for context.
- Image Encoding & Analysis: Base64-encode user image and query Meta LLaMA Scout via the Groq API.
- Response Generation: Receive doctor-style advice from the LLM.
- Speech Synthesis: Generate and play back spoken response using gTTS or ElevenLabs.
- Web Interface: Gradio serves as the frontend for input/output.
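The image-encoding and query-assembly steps above can be sketched as follows. The message shape follows Groq's OpenAI-compatible chat API; the exact model identifier is an assumption and should be checked against Groq's current model list:

```python
import base64

# Assumed model ID for Meta LLaMA Scout on Groq -- verify before use.
MODEL = "meta-llama/llama-4-scout-17b-16e-instruct"

def encode_image(image_path):
    """Base64-encode an image file for inline submission to the LLM."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

def build_image_query(prompt, image_path):
    """Assemble an OpenAI-style multimodal message: text plus a data-URL image."""
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{encode_image(image_path)}"}},
        ],
    }]

# The actual call (requires GROQ_API_KEY and the groq package):
# from groq import Groq
# client = Groq()
# reply = client.chat.completions.create(model=MODEL,
#                                        messages=build_image_query(prompt, "lesion.jpg"))
```

Keeping the payload assembly separate from the network call makes the encoding logic easy to test without an API key.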
- Python 3.8+
- Gradio: Web UI framework for rapid prototyping of ML interfaces.
- Groq SDK: Client for the chat (vision) and audio-transcription API calls.
- gTTS & ElevenLabs: Text-to-speech engines.
- SpeechRecognition & pydub: Audio recording and format conversion.
- dotenv: Environment variable management.
- winsound / afplay / aplay: Platform-specific inline audio playback (Windows, macOS, and Linux respectively).
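The playback entry above can be sketched by selecting the right player per platform. Note that `winsound` is a Windows-only stdlib module (and plays WAV, not MP3), while `afplay` and `aplay` are spawned as subprocesses:

```python
import platform
import subprocess

def playback_command(audio_path, system=None):
    """Pick the playback command for the given OS.

    Returns None on Windows (handled in-process via winsound),
    otherwise a subprocess argument list for afplay (macOS) or aplay (Linux).
    """
    system = system or platform.system()
    if system == "Windows":
        return None
    player = "afplay" if system == "Darwin" else "aplay"
    return [player, audio_path]

def play_audio(audio_path, system=None):
    """Play an audio file with the platform's native player."""
    cmd = playback_command(audio_path, system)
    if cmd is None:
        import winsound  # Windows-only stdlib module; WAV files only
        winsound.PlaySound(audio_path, winsound.SND_FILENAME)
    else:
        subprocess.run(cmd, check=True)
```

Separating command selection from execution keeps the branching logic testable without actually playing audio.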
- Clone the repository:
  git clone https://github.com/pranav-here/AI-Dermatologist.git
  cd AI-Dermatologist
- Install dependencies:
pip install -r requirements.txt
- Create a `.env` file with the following keys:
  ELEVENLABS_API_KEY=<your_elevenlabs_api_key>
  GROQ_API_KEY=<your_groq_api_key>
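In the app, `python-dotenv`'s `load_dotenv()` reads these `KEY=value` pairs into the process environment. A stdlib-only sketch of the same idea (the project itself should just call `load_dotenv()`):

```python
import os

def load_env_file(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv: read KEY=value
    lines into os.environ, skipping blanks, comments, and keys already set."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# In the app itself:
# from dotenv import load_dotenv
# load_dotenv()
# groq_key = os.getenv("GROQ_API_KEY")
```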
Launch the Gradio interface:
  python gradio_app.py

Open the URL printed in your console (typically http://127.0.0.1:7860).
- Step 1: Record your voice and/or upload an image.
- Step 2: View the transcribed text and AI-generated diagnosis.
- Step 3: Listen to the spoken response.
All dependencies are listed in requirements.txt. Example:
grpcio
gradio
pydub
speechrecognition
gtts
elevenlabs
groq
python-dotenv
Refer to requirements.txt for the full list and exact versions.
This project is licensed under the MIT License. See LICENSE for details.

