We have implemented a custom chatbot using Llamafile to interact with India's Budget 2024 speech on a local CPU. Developed by Mozilla, Llamafile packages AI models as single executables that run on almost any machine, avoiding the cost of cloud servers such as AWS and the constraints of frameworks like LangChain. This tutorial is written in plain Python, allowing a streamlined setup without requiring GPU resources.
We have implemented the following steps in the chatbot-budget2024 notebook.
- Download and install the prerequisites.
- Read the budget speech document from `.docx` format.
- Split the document into smaller chunks.
- Create embeddings using `TF-IDF` and build an embedding index with a `FAISS` vector store.
- Run FAISS's similarity search to find the k nearest relevant contexts w.r.t. the user query.
- Construct the prompt from a Q&A instruction, the relevant contexts, and the query.
- Run Llamafile as a server and chat with the local chatbot.
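The chunking, TF-IDF, retrieval, and prompt-construction steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: the toy chunks stand in for the split `.docx` text, and FAISS's `IndexFlatIP` is replaced by a brute-force NumPy inner-product search (TF-IDF rows are L2-normalised, so the inner product is the same cosine score FAISS would rank by):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy "chunks" standing in for the split budget-speech document
# (in the notebook these come from reading and splitting the .docx file)
chunks = [
    "The fiscal deficit is targeted at 5.1 percent of GDP.",
    "Capital expenditure outlay is increased substantially this year.",
    "A large corpus will support research and innovation in sunrise domains.",
]

# TF-IDF embeddings for every chunk (rows are L2-normalised by default)
vectorizer = TfidfVectorizer()
chunk_vectors = vectorizer.fit_transform(chunks).toarray()

def top_k_contexts(query, k=2):
    """Return the k chunks most similar to the query; the inner product on
    normalised TF-IDF vectors mirrors FAISS's IndexFlatIP search."""
    q = vectorizer.transform([query]).toarray()[0]
    scores = chunk_vectors @ q
    best = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in best]

# Build the prompt from a Q&A instruction, retrieved contexts, and the query
question = "What is the fiscal deficit target?"
contexts = top_k_contexts(question)
prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n".join(contexts) + "\n\n"
    "Question: " + question + "\nAnswer:"
)
```

The resulting `prompt` string is what gets sent to the Llamafile server in the final step.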
We have implemented the following steps in the chatbot-budget2024-colab notebook.
- Download and install the prerequisites.
- Read the budget speech document from `.docx` format.
- Split the document into smaller chunks.
- Create embeddings using `TF-IDF` and build an embedding index with a `FAISS` vector store.
- Run FAISS's similarity search to find the k nearest relevant contexts w.r.t. the user query.
- Construct the prompt from a Q&A instruction, the relevant contexts, and the query.
- Run Llamafile on the free Google Colab GPU using `subprocess` and chat with the GPU chatbot.
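Launching Llamafile from a Colab cell can be sketched as below. This is a hedged outline, not the notebook's exact code: the llamafile name is a placeholder, and the flags shown (`--server`, `--nobrowser`, `-ngl` for GPU layer offload) are standard Llamafile/llama.cpp server options:

```python
import subprocess
import time

# Placeholder name -- substitute the llamafile you actually downloaded
LLAMAFILE = "./mistral-7b-instruct.llamafile"

def server_command(llamafile, gpu_layers=9999):
    """Command line to run a Llamafile as a local server: '--nobrowser'
    suppresses the browser UI and '-ngl' offloads layers to the GPU."""
    return [llamafile, "--server", "--nobrowser", "-ngl", str(gpu_layers)]

def launch(llamafile):
    """Start the server with subprocess so the notebook cell returns
    immediately, then wait for the model to load."""
    proc = subprocess.Popen(
        server_command(llamafile),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    time.sleep(30)  # crude wait for model loading; poll the port in practice
    return proc

# Once running, the server exposes an OpenAI-compatible chat endpoint on
# port 8080, e.g.:
#   POST http://localhost:8080/v1/chat/completions
#   {"messages": [{"role": "user", "content": "<prompt>"}]}
```

The chat loop then POSTs each constructed prompt to that endpoint and prints the model's reply.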