NEW: Agentic Mode Now Available! Transform your local AI into an autonomous coding assistant that can read, create, edit, and organize files in your workspace. Perfect for automating development tasks, generating boilerplate code, and managing complex workflows - all running privately on your machine. Learn more about Agentic Mode below.
A beginner-friendly, privacy-first desktop application for running large language models locally on Windows, Linux, and macOS. Load and chat with GGUF format models like Mistral, LLaMA, DeepSeek, and others with zero setup required.
Direct Download: GGUFLoader_v2.1.1.agentic_mode.exe (~150-300 MB)
Step 2: Run the App
- Double-click the downloaded `GGUFLoader_v2.1.1.agentic_mode.exe` file
- Windows may show a security warning - click "More info" then "Run anyway" (this is normal for new apps)
- The app will start automatically - no installation needed!
Step 3: Download a Model
- Visit Local AI Zone for curated model recommendations
- Or browse Hugging Face for thousands of GGUF models
- Save it anywhere on your computer (e.g., Downloads folder)
Step 4: Load the Model
- In GGUF Loader, click the "Load Model" button
- Browse to where you saved your GGUF model file
- Select the model and click "Open"
- Wait for the model to load (progress bar will show)
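If a model refuses to load, a quick sanity check is whether the file really is GGUF. Per the GGUF specification, every file begins with the ASCII magic bytes `GGUF` followed by a little-endian version number. The helper below is an illustrative sketch, not part of GGUF Loader:

```python
import struct

def looks_like_gguf(path: str) -> bool:
    """Check the 4-byte GGUF magic and read the format version from the header."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != b"GGUF":
        return False
    # The next 4 bytes are the little-endian format version (e.g. 3).
    version = struct.unpack("<I", header[4:8])[0]
    print(f"GGUF version: {version}")
    return True
```

A file that fails this check was likely truncated during download or is in another format (e.g. an old GGML file), which would explain a loading error.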
Step 5: Start Chatting!
- Look for the floating chat button on your screen
- Click it to open the chat window
- Type your message and press Ctrl+Enter or click "Send"
- Enjoy your private, local AI assistant!
```shell
pip install ggufloader
ggufloader
```

Easy method - No coding knowledge needed!
Step 1: Download the ZIP file
- Click here: Download ZIP
- Save it anywhere on your computer
Step 2: Extract the ZIP file
- Right-click on the downloaded ZIP file
- Select "Extract All..." (Windows) or "Extract Here" (Linux/macOS)
- Choose where to extract it
Step 3: Run the launcher
For Windows:
- Open the extracted folder
- Double-click on `launch.bat`
- First time only: Wait 1-2 minutes while it downloads dependencies
- The app will start automatically!
- Next time: Just double-click `launch.bat` again - it starts instantly!
For Linux/macOS:
- Open the extracted folder
- Double-click on `launch.sh` (or right-click and choose Open)
- First time only: Wait 1-2 minutes while it downloads dependencies
- The app will start automatically!
- Next time: Just double-click `launch.sh` again - it starts instantly!
That's it! No Python installation needed, no command line, no complicated setup.
- Universal Model Support - Load ANY GGUF model from anywhere, not limited to pre-installed models
- Zero-Setup Model Loading - Use any downloaded GGUF model instantly without configuration or conversion
- Modern UI - Clean, intuitive interface built with PySide6
- Powerful Addon System - Enhance functionality by creating custom addons without modifying core code
- Floating Chat Button - Always-accessible chat interface that stays on top of all windows
- Agentic Mode - Advanced reasoning and task automation with multi-step problem solving
- Privacy First - All processing happens locally on your machine, no data leaves your computer
- Cross-Platform - Works seamlessly on Windows, Linux, and macOS
- Lightweight & Fast - Efficient memory usage and quick response times
- Beginner Friendly - No technical knowledge required, just download and run
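To give a feel for the addon system, here is a minimal plugin-registry sketch. The names (`AddonBase`, `register`, `on_message`) are hypothetical illustrations, not GGUF Loader's actual API - see the Addon Development guide for the real interface:

```python
# Hypothetical sketch of a plugin registry; AddonBase, register, and
# on_message are illustrative names, not GGUF Loader's real API.
REGISTRY = {}

class AddonBase:
    """Minimal interface an addon system like this might expose."""
    name = "base"

    def on_message(self, text: str) -> str:
        """Hook called on each chat message; default is a pass-through."""
        return text

def register(addon_cls):
    """Instantiate and register an addon class under its name."""
    REGISTRY[addon_cls.name] = addon_cls()
    return addon_cls

@register
class UppercaseAddon(AddonBase):
    """Toy addon: transforms chat output to uppercase."""
    name = "uppercase"

    def on_message(self, text: str) -> str:
        return text.upper()
```

The point of such a design is that new behavior lives entirely in the addon class; the core application only iterates over `REGISTRY` and calls the hooks.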
Mistral-7B Instruct (4.23 GB) - Recommended for Agentic Mode
- Download Q4_0
- Excellent reasoning and task automation capabilities
- Perfect for agentic workflows and multi-step problem solving
- Fast inference with strong instruction following
GPT-OSS 20B (7.34 GB)
LLaMA 3 8B Instruct (4.68 GB)
- Quick Reference - Fast answers to common tasks
- Installation Guide - Detailed setup instructions
- User Guide - How to use GGUF Loader
- Addon Development - Create your own addons
- FAQ - Frequently asked questions
- All Documentation - Complete documentation index
GGUF Loader now supports agentic mode, enabling the AI assistant to autonomously manage your workspace. The assistant can read, create, edit, and organize files within your project folder, automating development tasks and workflows.
- Read Files - Analyze code, documentation, and project structure
- Create Files - Generate new source files, configs, and documentation
- Edit Files - Modify existing code and update configurations
- Organize Files - Create folders, move files, and restructure projects
- Automate Tasks - Execute multi-step workflows without manual intervention
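File operations like these should stay confined to the workspace folder you grant access to. The guard below sketches the kind of path-traversal check such a system needs; it is an illustration of the principle, not GGUF Loader's implementation:

```python
from pathlib import Path

def resolve_in_workspace(workspace: str, relative: str) -> Path:
    """Resolve a requested path, refusing anything that escapes the workspace.

    Guards against traversal requests like "../../etc/passwd".
    """
    root = Path(workspace).resolve()
    target = (root / relative).resolve()
    # After resolving symlinks and "..", the target must still sit under root.
    if target != root and root not in target.parents:
        raise ValueError(f"path escapes workspace: {relative}")
    return target
```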
- Load Mistral-7B (recommended for best results)
  - Download from the models section above
  - Load it in GGUF Loader
- Enable Agentic Mode
  - Open the chat window
  - Select "Agentic Mode" from the settings
  - Grant workspace access permissions
- Example Tasks
- "Create a new feature module with proper structure"
- "Refactor this codebase and organize files"
- "Generate boilerplate code for a new component"
- "Update all configuration files with new settings"
- "Create documentation for this project"
- Mistral-7B Instruct - Best choice: excellent reasoning, fast inference, perfect for code generation
- LLaMA 3 8B - Strong reasoning and code understanding
- GPT-OSS 20B - More powerful for complex refactoring tasks
- OS: Windows 10/11, Linux, or macOS
- RAM: 4GB minimum (8GB recommended)
- Storage: 2GB free space
- GPU: Optional (CUDA/OpenCL support)
GGUF Loader supports GPU acceleration for significantly faster inference speeds. If you have an NVIDIA GPU, follow these steps:
- NVIDIA GPU (GTX 1060 or newer recommended)
- CUDA Toolkit installed (CUDA 12.x recommended)
- Latest NVIDIA drivers
Step 1: Run the GPU installation script
Option A: Pre-built wheel (Recommended - Fastest)
```shell
# Windows
install_gpu_llama.bat

# Linux/macOS
chmod +x install_gpu_llama.sh
./install_gpu_llama.sh
```

Option B: Build from source (requires Visual Studio Build Tools)

```shell
# Windows
install_gpu_llama_source.bat

# Linux/macOS
chmod +x install_gpu_llama_source.sh
./install_gpu_llama_source.sh
```

Step 2: Verify GPU support

```shell
python verify_gpu_support.py
```

Step 3: Use GPU acceleration
- Launch GGUF Loader
- In the "Processing Mode" dropdown, select "GPU Accelerated"
- Load your model - you'll see "(GPU)" in the status
- Start chatting with GPU-accelerated inference!
- RTX 4060 (8GB): Can offload 25-40 layers depending on model size
- RTX 3060 (12GB): Can offload 40-50 layers
- RTX 4090 (24GB): Can offload entire models (60+ layers)
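The layer counts above can be roughly estimated from available VRAM. The heuristic below is a sketch under stated assumptions: the ~0.25 GB-per-layer figure approximates a Q4-quantized 7B-class model, and 1 GB is reserved for the KV cache and CUDA overhead. Real numbers vary with model architecture and quantization:

```python
def estimate_gpu_layers(vram_gb: float, total_layers: int = 60,
                        gb_per_layer: float = 0.25,
                        reserve_gb: float = 1.0) -> int:
    """Roughly estimate how many transformer layers fit in VRAM.

    Assumes about `gb_per_layer` per layer (Q4-quantized 7B-class model)
    and keeps `reserve_gb` back for the KV cache and runtime overhead.
    """
    usable = max(vram_gb - reserve_gb, 0.0)
    return min(total_layers, int(usable / gb_per_layer))
```

For example, an 8 GB card yields (8 - 1) / 0.25 = 28 layers under these assumptions, consistent with the 25-40 range quoted above; a 24 GB card comfortably covers a full model.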
Run this in a separate terminal while using GGUF Loader:
```shell
# Windows
monitor_gpu.bat

# Linux/macOS
watch -n 1 nvidia-smi
```

Watch the "GPU-Util" column increase when generating responses - this confirms GPU acceleration is working!
- "pip not recognized": The script will automatically activate your virtual environment
- Slow speeds: Try increasing GPU layers in `models/model_loader.py` (default: 35)
- Out of memory: Reduce GPU layers or use a smaller model
- No speedup: Verify CUDA is installed with `nvidia-smi`
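A quick programmatic version of that last check - does `nvidia-smi` exist on PATH and run cleanly - can be done with the standard library. This is a convenience sketch, not a script shipped with GGUF Loader:

```python
import shutil
import subprocess

def cuda_driver_available() -> bool:
    """Return True if `nvidia-smi` is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # NVIDIA driver tools not installed or not on PATH
    return subprocess.run([exe], capture_output=True).returncode == 0
```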
We welcome contributions! See CONTRIBUTING.md for guidelines.
This project is licensed under the MIT License - see LICENSE for details.
Report security vulnerabilities to: hossainnazary475@gmail.com
See SECURITY.md for our security policy.
- Report Issues
- Discussions
- Email: hossainnazary475@gmail.com
Built with ❤️ by the GGUF Loader community
