
⚡ Kash

Cache your knowledge. Channel the Akashic.
Compile your documents into an embedded GraphRAG brain. Ship AI agents as Docker images.

Agent as a Service · Knowledge as a Service

Kash /kæʃ/ — A double entendre by design. Like a cache, it compiles your heavy knowledge into a fast, local binary you can ship anywhere. Like the Akashic records, it holds a complete, queryable record of everything you've fed it. One word. Two meanings. Zero infrastructure.


💡 What is Kash?

Kash is a Go CLI that turns your raw documents (PDFs, Markdown, text files) into a self-contained AI agent packaged in a lightweight Docker container (~50MB).

No Python runtime. No external vector databases. No infrastructure headaches.

Your Documents  →  kash build  →  Docker Image  →  Ship Anywhere 🚀

Think of it like a static site generator, but for AI brains. You compile knowledge at build time, and the runtime only serves queries — fast, lightweight, and portable.

The "Compiler" Approach

| Traditional RAG Stack | Kash |
|-----------------------|------|
| Python app + Pinecone + Redis + FastAPI | Single Go binary + lightweight Docker image |
| Runtime document ingestion | Build-time compilation |
| External vector DB dependency | Embedded pure-Go vector store |
| Complex deployment | docker run and done |
| $$$ infrastructure costs | Runs on a Raspberry Pi |

🎯 Use Cases

📖 Expert Knowledge Agent

Feed your company docs, runbooks, or research papers. Get an AI that actually knows your stuff and cites sources.

Example: Internal engineering wiki → Docker image → every dev has a domain expert on tap.

🎓 Study / Exam Prep Agent

Compile textbooks and notes into a Socratic tutor that quizzes you, explains concepts, and never makes things up.

Example: UPSC prep material → AI tutor → study from anywhere.

🛠️ Product Support Agent

Turn your API docs, changelogs, and FAQs into a support bot that plugs into any chat UI or IDE.

Example: Docs + release notes → Docker image → mount in Open WebUI.

🤝 Multi-Agent Teams

Spin up multiple specialized agents (legal, finance, engineering) and wire them together via the A2A protocol.

Example: Three domain agents → CrewAI orchestration → one smart team.


Scanned PDFs / OCR

Note: Scanned (image-only) PDF support (OCR) is not yet available. Kash currently extracts embedded/selectable text only.


⚡ Quick Start

5-Minute Setup

# 1. Install Kash (requires Go — or build from source, see "Building from Source" below)
go install github.com/akashicode/kash/cmd/kash@latest

# 2. Configure your API providers
mkdir -p ~/.kash
cat > ~/.kash/config.yaml << 'EOF'
build_providers:
  llm:
    base_url: "https://api.openai.com/v1"
    api_key: "sk-..."
    model: "gpt-4o"
  embedder:
    base_url: "https://api.voyageai.com/v1"
    api_key: "pa-..."
    model: "voyage-3"       # dimensions must match runtime.embedder.dimensions in agent.yaml
EOF

# 3. Scaffold a new agent
kash init my-expert

# 4. Add your knowledge
cp ~/docs/*.pdf my-expert/data/
cp ~/notes/*.md my-expert/data/

# 5. Compile the knowledge base
cd my-expert
kash build

# 6. Serve locally (no Docker needed!)
kash serve

Your agent is now live at http://localhost:8000 with three interfaces ready to go.


πŸ—οΈ Architecture

flowchart TB
    subgraph BUILD["🔨 Build Time"]
        direction LR
        D["📄 Documents\nPDF / MD / TXT"] --> CK["Chunker"]
        CK --> EMB["Embedder API"]
        CK --> LLM1["LLM API\ntriple extraction"]
        EMB --> VDB["Vector DB\ndata/memory.chromem"]
        LLM1 --> GDB["Graph DB\ndata/knowledge.cayley"]
    end

    BUILD -- "docker build" --> RUNTIME

    subgraph RUNTIME["⚡ Runtime — port 8000"]
        direction LR
        Q["Query"] --> HS["Hybrid Search\nVector + Graph"]
        HS --> RR["Rerank\noptional"]
        RR --> LLM2["LLM"]
        LLM2 --> REST["REST API\nPOST /v1/chat/completions"]
        LLM2 --> MCP["MCP Server\nGET /mcp"]
        LLM2 --> A2A["A2A Protocol\nPOST /rpc/agent"]
    end

Core Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| CLI Framework | spf13/cobra | Developer interface (init, build, serve) |
| Vector Memory | philippgille/chromem-go | Pure-Go embedded vector store |
| Graph Memory | cayleygraph/cayley | Embedded knowledge graph (triples) |
| LLM Client | sashabaranov/go-openai | Build-time extraction & runtime queries |
| MCP Protocol | Model Context Protocol | Tool exposure for Cursor / Windsurf / IDEs |
| A2A Protocol | JSON-RPC | Multi-agent orchestration (AutoGen, CrewAI) |

Hybrid RAG Pipeline

Every query (REST, MCP, A2A) runs through the same pipeline:

flowchart LR
    Q["Query"] --> E["Embed"]
    Q --> K["Keywords"]
    E --> VS["Vector Search\nchromem-go"]
    K --> GT["Graph Traversal\ncayley"]
    VS --> M["Merge"]
    GT --> M
    M --> R["Rerank\noptional"]
    R --> C["Context → LLM"]
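
The merge step can be sketched in a few lines of Python. This is a hypothetical scoring scheme for illustration, not Kash's actual Go implementation: vector hits carry similarity scores, graph-only hits enter with a neutral base score, and chunks found by both retrievers get a small boost.

```python
def merge_results(vector_hits, graph_hits, graph_boost=0.1):
    """Merge vector-search and graph-traversal results into one ranking.

    vector_hits: list of (chunk_id, similarity) pairs from the vector store.
    graph_hits:  set of chunk_ids reached by graph traversal.
    """
    scores = dict(vector_hits)
    for cid in graph_hits:
        # Chunks corroborated by both retrievers are boosted;
        # graph-only chunks start from a neutral 0.5 base score.
        scores[cid] = scores.get(cid, 0.5) + graph_boost
    # Highest combined score first, ready for the optional rerank step.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

With `vector_hits = [("a", 0.9), ("b", 0.7)]` and `graph_hits = {"b", "c"}`, chunk "b" ranks above "c" because both retrievers found it.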

🖥️ CLI Reference

kash init <name>

Scaffolds a new agent project.

kash init my-agent

Creates:

my-agent/
├── data/               # Drop your PDFs, Markdown, TXT here
├── agent.yaml          # Agent persona + config
├── Dockerfile          # Ready for docker build
├── docker-compose.yml  # One-command local deployment
├── .env.example        # Runtime env var template
├── .dockerignore       # Keeps images clean
└── README.md           # Auto-generated docs

kash build

Compiles documents into vector + graph databases.

kash build                     # in current directory
kash build --dir ./my-agent    # specify project dir

| Flag | Short | Default | Description |
|------|-------|---------|-------------|
| --dir | -d | . | Project directory to build |

Pipeline:

  1. Load documents from data/
  2. Chunk text into passages
  3. Generate vector embeddings → data/memory.chromem/
  4. Extract knowledge graph triples → data/knowledge.cayley/
  5. Auto-generate MCP tool descriptions → agent.yaml
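
Step 2 (chunking) can be illustrated with a minimal sliding-window splitter. This is a simplified sketch with made-up defaults, not the chunker Kash actually ships:

```python
def chunk_text(text, size=500, overlap=50):
    """Split text into overlapping fixed-size windows.

    The overlap keeps sentences that straddle a chunk boundary
    retrievable from both neighbouring chunks.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk then flows into both branches of the pipeline: the embedder (step 3) and the triple extractor (step 4).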

kash serve

Starts the runtime HTTP server.

kash serve                          # default: port 8000, ./agent.yaml
kash serve --port 9000              # custom port
kash serve --dir ./my-agent         # serve from specific directory
kash serve --agent custom.yaml      # custom agent config path

| Flag | Short | Default | Description |
|------|-------|---------|-------------|
| --port | -p | 8000 | Listen port (overridden by PORT env var) |
| --agent | -a | agent.yaml | Path to agent configuration |
| --dir | -d | . | Project directory |

kash version

kash version
# kash v1.0.0
#   commit:     a3f9c12
#   built:      2026-02-27T10:00:00Z
#   go version: go1.25.0
#   os/arch:    linux/amd64

🔌 Runtime Interfaces

All three interfaces serve concurrently on a single port.

REST API — POST /v1/chat/completions

Drop-in replacement for the OpenAI API. Intercepts requests, runs hybrid RAG, injects context, proxies to your LLM.

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Explain the key concepts"}]
  }'

Works with LibreChat, Open WebUI, AnythingLLM, and any OpenAI-compatible client.
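
Conceptually, the interception looks like this Python sketch (the real server does this in Go before proxying upstream, and the prompt wording here is invented):

```python
def inject_context(messages, retrieved_chunks):
    """Prepend retrieved knowledge as a system message, leaving the
    client's original messages untouched."""
    context = "\n\n".join(retrieved_chunks)
    system = {
        "role": "system",
        "content": f"Answer using only the following context:\n\n{context}",
    }
    return [system] + list(messages)
```

The client keeps speaking plain OpenAI chat completions; the retrieval step is invisible to it.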

MCP Server — GET /mcp

Model Context Protocol over HTTP SSE. Exposes your knowledge base as tools to IDEs.

{
  "mcpServers": {
    "my-agent": {
      "url": "http://localhost:8000/mcp"
    }
  }
}

Tested and working with Cursor and Windsurf.

A2A Protocol — POST /rpc/agent

JSON-RPC for multi-agent frameworks.

# Agent info
curl http://localhost:8000/rpc/agent \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"agent.info"}'

# Query knowledge
curl http://localhost:8000/rpc/agent \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":2,"method":"agent.query","params":{"query":"your question"}}'
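
A tiny client-side helper shows the request shape (a sketch: the method names agent.info and agent.query come from the examples above, everything else is generic JSON-RPC 2.0):

```python
import itertools
import json

_next_id = itertools.count(1)

def a2a_request(method, params=None):
    """Build a JSON-RPC 2.0 request body for POST /rpc/agent."""
    body = {"jsonrpc": "2.0", "id": next(_next_id), "method": method}
    if params is not None:
        body["params"] = params
    return json.dumps(body)
```

Send the returned string as the POST body with Content-Type: application/json, exactly as in the curl examples above.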

🧪 A2A protocol implementation is complete. Integration testing with AutoGen/CrewAI is in progress.


πŸ” Security β€” API Key Auth

By default all endpoints are open (ideal for local dev). Set AGENT_API_KEY to enable authentication on all endpoints except /health.

export AGENT_API_KEY="my-secret-key"
kash serve

The key is passed as a standard Bearer token — compatible with all three interfaces:

curl / any HTTP client

curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer my-secret-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"hello"}]}'

OpenAI Python / JS SDK

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="my-secret-key",   # ← AGENT_API_KEY goes here
)
import OpenAI from 'openai';
const client = new OpenAI({
  baseURL: 'http://localhost:8000/v1',
  apiKey: 'my-secret-key',
});

MCP clients (Cursor, Claude Desktop, Windsurf)

{
  "mcpServers": {
    "my-agent": {
      "url": "http://localhost:8000/mcp",
      "env": {
        "API_KEY": "my-secret-key"
      }
    }
  }
}

A2A clients

curl http://localhost:8000/rpc/agent \
  -H "Authorization: Bearer my-secret-key" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"agent.info"}'

When AGENT_API_KEY is not set, everything works without any header (open access).
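
The rules above boil down to a small decision function. A Python sketch for clarity (Kash implements this in Go; the function name is hypothetical):

```python
def is_authorized(path, auth_header, configured_key):
    """Apply the access rules described above:
    - /health is always public
    - if AGENT_API_KEY is unset, every endpoint is open
    - otherwise require 'Authorization: Bearer <key>'
    """
    if path == "/health" or not configured_key:
        return True
    return auth_header == f"Bearer {configured_key}"
```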


Health Check — GET /health

curl http://localhost:8000/health
{
  "status": "ok",
  "agent": "my-expert",
  "version": "1.0.0",
  "vectors": 892,
  "triples": 1423,
  "mcp_tools": 1,
  "embed_dimensions": 1024,
  "llm_model": "gpt-4o",
  "embed_model": "voyage-3",
  "reranker_enabled": false,
  "auth_enabled": true,
  "time": "2026-02-27T10:00:00Z"
}

/health is always public — no auth required even when AGENT_API_KEY is set.


🚀 Running Your Agent

Option 1: Local (No Docker)

Perfect for development and testing. Just build and serve directly:

# Set up providers
export LLM_BASE_URL="https://api.openai.com/v1"
export LLM_API_KEY="sk-..."
export LLM_MODEL="gpt-4o"
export EMBED_BASE_URL="https://api.voyageai.com/v1"
export EMBED_API_KEY="pa-..."

# Build the knowledge base
kash build

# Serve it
kash serve

That's it. Hit http://localhost:8000 and start chatting.

Option 2: Docker Compose (Recommended)

One command to build and run:

# Fill in your keys
cp .env.example .env
# edit .env with your API keys

# Build the knowledge base first
kash build

# Build image + run
docker compose up --build

Option 3: Docker Run (Manual)

# Build the image
docker build -t my-agent:latest .

# Run with env vars
docker run -p 8000:8000 \
  -e LLM_BASE_URL="https://api.openai.com/v1" \
  -e LLM_API_KEY="sk-..." \
  -e LLM_MODEL="gpt-4o" \
  -e EMBED_BASE_URL="https://api.voyageai.com/v1" \
  -e EMBED_API_KEY="pa-..." \
  -e AGENT_API_KEY="my-secret-key" \
  my-agent:latest

Option 4: Share With the World 🌍

Build a multi-arch image and push to any registry:

# Build for both x86 and ARM (runs on servers + Raspberry Pi)
docker buildx build --platform linux/amd64,linux/arm64 \
  -t ghcr.io/you/my-agent:v1 --push .

# Anyone can now run your agent with one command:
docker run -p 8000:8000 --env-file .env ghcr.io/you/my-agent:v1

Your agent is now a portable Docker image that anyone can pull and run. They just bring their own API keys.


βš™οΈ Configuration

Build-Time: ~/.kash/config.yaml

Used by kash build to call LLM and embedding APIs.

build_providers:
  llm:
    base_url: "https://api.openai.com/v1"    # or any OpenAI-compatible endpoint
    api_key: "sk-..."
    model: "gpt-4o"
  embedder:
    base_url: "https://api.voyageai.com/v1"
    api_key: "pa-..."
    model: "voyage-3"                          # optional if using a router
  # reranker:        # optional — must be Cohere-compatible (/rerank endpoint)
  #   base_url: "https://api.cohere.ai/v1"  # Cohere, Jina, Voyage, or a LiteLLM proxy
  #   api_key: "..."
  #   model: "rerank-english-v3.0"           # or jina-reranker-v2-base-en, rerank-1, etc.

Provider agnostic — works with any OpenAI-compatible endpoint. Use LiteLLM, Ollama, or TrueFoundry as a proxy.

Runtime: Environment Variables

Used by kash serve and Docker containers.

| Variable | Required | Description |
|----------|----------|-------------|
| LLM_BASE_URL | ✅ | OpenAI-compatible LLM endpoint |
| LLM_API_KEY | ✅ | LLM API key |
| LLM_MODEL | ✅ | Model name (e.g. gpt-4o) |
| EMBED_BASE_URL | ✅ | Embedding API endpoint |
| EMBED_API_KEY | ✅ | Embedding API key |
| EMBED_MODEL | ❌ | Embedding model (optional if using a router) |
| RERANK_BASE_URL | ❌ | Reranker base URL — must expose a Cohere-compatible /rerank endpoint |
| RERANK_API_KEY | ❌ | Reranker API key |
| RERANK_MODEL | ❌ | Reranker model name (e.g. rerank-english-v3.0) |
| RERANK_ENDPOINT | ❌ | Full rerank URL override (e.g. https://gateway.example.com/v1/rerank); takes priority over RERANK_BASE_URL |
| AGENT_API_KEY | ❌ | Enable auth — all endpoints (except /health) require Authorization: Bearer \<key\> |
| PORT | ❌ | Override listen port (default: 8000) |
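
The required/optional split in the table can be expressed as a small validation sketch (Python, illustrative only; Kash's own config loading lives in internal/config):

```python
import os

REQUIRED = ("LLM_BASE_URL", "LLM_API_KEY", "LLM_MODEL",
            "EMBED_BASE_URL", "EMBED_API_KEY")

def load_runtime_config(env=None):
    """Check required variables and derive the optional toggles."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise SystemExit("missing required env vars: " + ", ".join(missing))
    return {
        "port": int(env.get("PORT", "8000")),
        "auth_enabled": bool(env.get("AGENT_API_KEY")),
        "rerank_enabled": bool(env.get("RERANK_BASE_URL") or env.get("RERANK_ENDPOINT")),
    }
```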

Agent Config: agent.yaml

Each project has an agent.yaml that defines persona, embedding dimensions, and MCP tools:

agent:
  name: "my-expert"
  version: "1.0.0"
  description: "An expert AI agent powered by Kash"
  system_prompt: |
    You are a highly knowledgeable expert assistant...

runtime:
  embedder:
    dimensions: 1024    # must match build AND serve time

mcp:
  tools:
    - name: "search_my_expert_knowledge"
      description: "Auto-generated by kash build"

server:
  port: 8000
  cors_origins: ["*"]

Important: The dimensions value is NOT sent to the embedding API — some providers don't support it. Kash handles truncation locally.
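
In effect the embedding is cut down locally, something like this sketch (assumption: plain truncation followed by re-normalisation so cosine similarity stays meaningful; check the source for the exact behavior):

```python
import math

def fit_dimensions(vector, dims):
    """Truncate an embedding to the configured dimensions, then
    re-normalise to unit length."""
    if len(vector) < dims:
        raise ValueError(f"embedding has {len(vector)} dims, expected at least {dims}")
    truncated = vector[:dims]
    norm = math.sqrt(sum(x * x for x in truncated)) or 1.0
    return [x / norm for x in truncated]
```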


🔨 Building from Source

Prerequisites

Go 1.25+ and Git.

Build

git clone https://github.com/akashicode/kash.git
cd kash

# Build for your platform
go build -o bin/kash ./cmd/kash

# Or use Make
make build

Cross-Compile

# Linux
GOOS=linux GOARCH=amd64 go build -o bin/kash-linux ./cmd/kash

# macOS (Apple Silicon)
GOOS=darwin GOARCH=arm64 go build -o bin/kash-darwin ./cmd/kash

# Windows
GOOS=windows GOARCH=amd64 go build -o bin/kash.exe ./cmd/kash

# All platforms at once
make build-all

Install System-Wide

# Linux / macOS
sudo make install
# → installs to /usr/local/bin/kash

# Windows (PowerShell as Admin)
Copy-Item bin\kash.exe C:\Windows\System32\kash.exe

🧪 Development

make test         # Run all tests
make test-v       # Verbose output
make coverage     # Generate HTML coverage report
make fmt          # Format code
make vet          # Static analysis
make lint         # golangci-lint (install first)
make tidy         # go mod tidy
make clean        # Remove build artifacts

Project Layout

kash/
├── cmd/                          # CLI commands (Cobra)
│   ├── kash/main.go              # Entry point
│   ├── root.go                   # Root command + Viper config
│   ├── init.go                   # kash init
│   ├── build.go                  # kash build
│   ├── serve.go                  # kash serve
│   └── version.go                # kash version
├── internal/
│   ├── config/                   # Unified config (env + YAML)
│   ├── display/                  # Colorful CLI output + banners
│   ├── chunker/                  # Text chunking
│   ├── reader/                   # Document loading (PDF, MD, TXT)
│   ├── llm/                      # LLM client, embedder, reranker
│   ├── vector/                   # chromem-go vector store
│   ├── graph/                    # cayley knowledge graph
│   └── server/                   # HTTP server (REST, MCP, A2A)
├── Makefile
├── Dockerfile                    # Base image (multi-arch)
└── go.mod

📊 Project Status

| Feature | Status | Notes |
|---------|--------|-------|
| kash init | ✅ Stable | Full project scaffolding |
| kash build | ✅ Stable | PDF, Markdown, TXT ingestion |
| kash serve | ✅ Stable | All three interfaces |
| REST API | ✅ Tested | Drop-in OpenAI replacement |
| MCP Server | ✅ Tested | Works with Cursor & Windsurf |
| A2A Protocol | 🧪 In Progress | Implementation done, testing pending |
| Hybrid RAG | ✅ Stable | Vector + Graph search |
| Reranker | ✅ Optional | Cohere-compatible rerank API (/rerank endpoint) |
| Multi-arch Docker | ✅ Stable | amd64 + arm64 |
| Streaming responses | ✅ Stable | SSE streaming for REST API |

🌟 Why Kash?

🧊 Zero Infrastructure — No Pinecone, no Redis, no PostgreSQL. Everything is embedded in a single binary.
🐳 Ship as Docker — Your agent is a lightweight image. Push to a registry and anyone can run it with docker run.
🔑 BYOM (Bring Your Own Model) — Works with OpenAI, Anthropic (via proxy), Ollama, LiteLLM, TrueFoundry — any OpenAI-compatible endpoint.
⚡ Fast — Go binary starts in <50ms. No Python cold starts. No dependency hell.
🧠 Hybrid RAG — Vector similarity + knowledge graph traversal. Better context than vector-only retrieval.
🔌 Three Interfaces — REST (any chat UI), MCP (IDEs), A2A (multi-agent). One build, three ways to connect.

📜 License

MIT — do whatever you want with it.


⚡ Kash
Cache your knowledge. Channel the Akashic. No infrastructure required.