ChatPilot is a tool-augmented Agentic Copilot that goes beyond simple chat: it uses a single AI agent to interact with your local files, conduct deep web research, and perform autonomous data analysis through Python code execution.
Powered by Docling, ChatPilot supports both digital and scanned documents with layout-aware parsing and OCR.
- Multi-keyword and clause-level retrieval
- Dynamic relevance re-evaluation by the agent
- Deep context retrieval when shallow chunks are insufficient
- Coordinate-based highlighting of retrieved content directly in PDFs
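Because each chunk carries its page number and bounding box, mapping an answer back to the PDF is a simple lookup. A minimal sketch of that idea, with hypothetical `Chunk` and `highlights_for` names and illustrative coordinates (the real pipeline gets bboxes from Docling):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Chunk:
    """A retrieved text chunk with its position in the source PDF."""
    text: str
    page: int
    bbox: Tuple[float, float, float, float]  # (x0, y0, x1, y1) in page points

def highlights_for(chunks: List[Chunk], query: str) -> List[dict]:
    """Return page/bbox pairs for every chunk containing the query term,
    ready to be drawn as highlight rectangles in a PDF viewer."""
    return [
        {"page": c.page, "bbox": c.bbox}
        for c in chunks
        if query.lower() in c.text.lower()
    ]

chunks = [
    Chunk("Revenue grew 12% in Q3.", page=2, bbox=(72.0, 540.0, 520.0, 556.0)),
    Chunk("Headcount was flat.", page=3, bbox=(72.0, 300.0, 480.0, 316.0)),
]
print(highlights_for(chunks, "revenue"))  # one hit, on page 2
```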
ChatPilot includes a research-grade scholarly retrieval pipeline focused on traceability.
- Integrated with OpenAlex (250M+ scholarly works)
- Natural language queries over academic literature
- Query decomposition RAG for higher recall and precision
- Retrieved results are reranked using semantic similarity
- All sources used by the LLM are returned with each response
- Each retrieved chunk is mapped to its exact location in the source document
- Referenced text is highlighted directly in the PDF viewer
- Multiple sources are surfaced and highlighted independently
This allows direct verification of claims without manually searching papers.
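The decomposition-plus-rerank flow above can be sketched in a few lines. This is a toy stand-in, not ChatPilot's implementation: the sub-queries are hand-written (the agent generates them in the real system) and bag-of-words cosine similarity stands in for the semantic reranker:

```python
from collections import Counter
import math

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity, a simple stand-in for a semantic reranker."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def rerank(query: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Order retrieved passages by similarity to the query and keep the best."""
    return sorted(passages, key=lambda p: cosine(query, p), reverse=True)[:top_k]

# A complex question decomposed into focused sub-queries (hand-written here).
sub_queries = ["transformer attention scaling", "transformer training cost"]
passages = [
    "Attention scaling in transformers is quadratic in sequence length.",
    "The history of rule-based chatbots.",
    "Training cost of transformers grows with model size.",
]
for q in sub_queries:
    print(q, "->", rerank(q, passages, top_k=1))
```

Each sub-query pulls its own best passage, which is why decomposition lifts recall on multi-part questions.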
ChatPilot functions as a data analyst agent for structured datasets.
- Upload large CSV or Excel files (no file-size limits)
- The LLM never accesses row-level data
- Python code is generated using schema-level context only
- Code executes inside an isolated Docker sandbox
Agent loop:
- Plan → Execute → Error analysis → Retry until success
All data remains local.
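The agent loop above can be sketched as a plain retry loop. This is a simplified illustration with hypothetical names (`run_with_retries`, `fake_llm`): the real system calls an LLM to plan and executes inside the Docker sandbox, whereas here a stub generator and a bare `exec` stand in:

```python
def run_with_retries(generate_code, max_attempts=3):
    """Plan -> Execute -> Error analysis -> Retry, as a plain loop.
    `generate_code(error)` stands in for the LLM: given the last error
    (or None on the first attempt), it returns a Python snippet."""
    error = None
    for attempt in range(1, max_attempts + 1):
        code = generate_code(error)     # Plan (an LLM call in the real system)
        scope = {}
        try:
            exec(code, scope)           # Execute (a Docker sandbox in ChatPilot)
            return scope.get("result")
        except Exception as exc:        # Error analysis feeds the next attempt
            error = f"{type(exc).__name__}: {exc}"
    raise RuntimeError(f"gave up after {max_attempts} attempts: {error}")

attempts = []
def fake_llm(error):
    attempts.append(error)
    if error is None:
        return "result = undefined_name"   # first plan has a bug
    return "result = sum(range(5))"        # corrected after seeing the NameError

print(run_with_retries(fake_llm))  # 10
```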
Integration with Tavily or Google Custom Search enables high-fidelity information gathering:
- Deep Research: Conducts multi-step, agentic research loops to generate thorough, structured reports, well suited to competitor analysis and market-trend research.
- Real-Time Knowledge: Accesses up-to-date news and specialized finance coverage.
- Web Fetch: Extracts and cleans content from URLs for use as context in analysis tasks.
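The cleaning half of Web Fetch can be approximated with the standard library alone. A minimal sketch (the hypothetical `clean_html` is not ChatPilot's extractor, which handles far messier pages): strip tags, scripts, and styles, keeping only visible text:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""
    def __init__(self):
        super().__init__()
        self.parts, self._skip = [], 0
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def clean_html(html: str) -> str:
    p = TextExtractor()
    p.feed(html)
    return " ".join(p.parts)

page = "<html><head><style>p{}</style></head><body><p>Q3 revenue rose.</p><script>track()</script></body></html>"
print(clean_html(page))  # Q3 revenue rose.
```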
ChatPilot supports parallel tool calls within a single agent step, allowing multiple retrieval or research tools to run concurrently. This reduces latency, enables multi-source context gathering, and improves efficiency while preserving a single-agent architecture.
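Conceptually, parallel tool calls within one agent step look like `asyncio.gather` over the selected tools. A sketch with hypothetical tool stubs (`search_web`, `search_scholar`) where sleeps stand in for network calls:

```python
import asyncio

async def search_web(query: str) -> str:
    await asyncio.sleep(0.05)          # stands in for a web-search API call
    return f"web results for {query!r}"

async def search_scholar(query: str) -> str:
    await asyncio.sleep(0.05)          # stands in for an OpenAlex call
    return f"scholar results for {query!r}"

async def agent_step(query: str) -> list[str]:
    # Both tool calls run concurrently within a single agent step,
    # so total latency is max(tool latencies), not their sum.
    return list(await asyncio.gather(search_web(query), search_scholar(query)))

print(asyncio.run(agent_step("llm evaluation")))
```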
Built on LangChain, ChatPilot can switch between the major model providers. Use the same agent with your preferred provider:
- OpenAI
- Anthropic
- DeepSeek
- Google (Gemini)
- Local models via Ollama
- Backend: FastAPI (Python 3.10+)
- Server: Uvicorn (ASGI server with multi-worker support)
- Agent Framework: LangChain (Single-agent architecture with parallel tool execution)
- Data Agent: LangGraph (For iterative code execution and self-correction)
- Vector Store: ChromaDB
- Embeddings & Reranking:
- Local Option: HuggingFace embeddings (BAAI BGE-small-en-v1.5) + BGE reranker-v2-m3 (fully offline)
- Cloud Option: Cohere API (1024D embeddings with reranking)
- Containerization: Docker (Secure code execution sandbox)
- Document Processing: Docling (Advanced document intelligence with OCR and bbox extraction)
- Web Search: Tavily API, Google Custom Search
- Scholar Research: OpenAlex API
- Code Execution: Isolated Docker sandbox with resource limits
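The shape of a sandbox invocation can be illustrated by assembling a `docker run` command with standard resource-limit flags. The specific limit values and mount paths below are illustrative, not ChatPilot's actual configuration:

```python
def sandbox_command(script_path: str) -> list[str]:
    """Build a `docker run` invocation for the code-execution sandbox.
    Flag values are illustrative; the image name matches the build step below."""
    return [
        "docker", "run", "--rm",
        "--network", "none",     # generated code gets no network access
        "--memory", "512m",      # cap RAM
        "--cpus", "1",           # cap CPU
        "-v", f"{script_path}:/app/script.py:ro",
        "python-sandbox:latest",
        "python", "/app/script.py",
    ]

print(sandbox_command("/tmp/job.py"))
```

Passing the command as a list (e.g. to `subprocess.run`) avoids shell-quoting issues with user-derived paths.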
For detailed setup instructions including prerequisites, API key configuration, and Docker setup, see docs/SETUP.md.
Quick Start:
# 1. Clone and install dependencies
git clone https://github.com/Sreehari05055/ChatPilot.git
cd ChatPilot
pip install -r requirements.txt
# 2. Set up your .env file with API keys (see docs/SETUP.md)
# 3. Build Docker sandbox
docker build -t python-sandbox:latest -f Dockerfile .
# 4. Run the server
python main.py

The server runs on http://localhost:8000 with 4 worker processes.
Apache-2.0