CortexaDB
API Reference

Python API

Complete reference for the CortexaDB Python package.

CortexaDB

The main database class.

CortexaDB.open(path, **kwargs)

Opens or creates a database at the specified path.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| path | str | Required | Database directory path |
| dimension | int? | None | Vector dimension. Required if no embedder is set |
| embedder | Embedder? | None | Embedding provider for auto-embedding |
| sync | str | "strict" | Sync policy: "strict", "async", or "batch" |
| max_entries | int? | None | Maximum number of memories before eviction |
| max_bytes | int? | None | Maximum storage size in bytes before eviction |
| index_mode | str or dict | "exact" | "exact", "hnsw", or an HNSW config dict |
| record | str? | None | Path to replay log file for recording |

Returns: CortexaDB

Example:

from cortexadb import CortexaDB
from cortexadb.providers.openai import OpenAIEmbedder

# With embedder
db = CortexaDB.open("agent.mem", embedder=OpenAIEmbedder())

# With manual dimension
db = CortexaDB.open("agent.mem", dimension=128, sync="batch")

# With HNSW indexing
db = CortexaDB.open("agent.mem", dimension=128, index_mode={
    "type": "hnsw", "m": 16, "ef_search": 50, "metric": "cos"
})

CortexaDB.replay(log_path, db_path, **kwargs)

Rebuilds a database from a replay log file.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| log_path | str | Required | Path to the replay log file |
| db_path | str | Required | Path for the new database |
| sync | str | "strict" | Sync policy for the new database |
| strict | bool | False | If True, raises on the first error. If False, skips bad operations |

Returns: CortexaDB

Example:

db = CortexaDB.replay("session.log", "restored.mem", strict=False)
report = db.last_replay_report

Memory Operations

.remember(text, embedding=None, metadata=None)

Stores a new memory entry. If an embedder is configured and no embedding is provided, the text is auto-embedded.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | Required | Text content to store |
| embedding | list[float]? | None | Pre-computed embedding vector |
| metadata | dict[str, str]? | None | Key-value metadata pairs |

Returns: int - The assigned memory ID

Example:

mid = db.remember("User prefers dark mode")
mid = db.remember("text", metadata={"source": "onboarding"})
mid = db.remember("text", embedding=[0.1, 0.2, ...])

.ask(query, embedding=None, top_k=5, use_graph=False, recency_bias=False)

Performs a hybrid search across the database.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | Required | Search query text |
| embedding | list[float]? | None | Pre-computed query embedding |
| top_k | int | 5 | Number of results to return |
| use_graph | bool | False | Enable graph expansion via BFS |
| recency_bias | bool | False | Boost recent memories in scoring |

Returns: list[Hit]

Example:

hits = db.ask("What does the user prefer?")
hits = db.ask("query", top_k=10, use_graph=True, recency_bias=True)

for hit in hits:
    print(f"ID: {hit.id}, Score: {hit.score:.3f}")
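
The docs don't specify how recency_bias adjusts scores, but the general idea can be illustrated with an exponential-decay blend. This is a hypothetical sketch only (the weights, half-life, and formula are assumptions, not CortexaDB's documented behavior):

```python
import math
import time

def recency_score(similarity, created_at, now=None, half_life=86400.0):
    """Blend similarity with an exponential recency decay.

    Illustration only: CortexaDB does not document its recency formula.
    The decay term halves every `half_life` seconds of memory age.
    """
    now = time.time() if now is None else now
    age = max(0.0, now - created_at)
    decay = math.exp(-math.log(2) * age / half_life)
    return 0.7 * similarity + 0.3 * decay  # assumed 70/30 weighting

# A brand-new memory outranks an equally similar week-old one
fresh = recency_score(0.8, created_at=1_000_000, now=1_000_000)
stale = recency_score(0.8, created_at=1_000_000 - 7 * 86400, now=1_000_000)
```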

.get_memory(mid)

Retrieves a full memory entry by ID.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| mid | int | Memory ID |

Returns: Memory

Raises: CortexaDBNotFoundError if the memory doesn't exist.

Example:

mem = db.get_memory(42)
print(mem.id)          # 42
print(mem.content)     # b"User prefers dark mode"
print(mem.namespace)   # "default"
print(mem.metadata)    # {"source": "onboarding"}
print(mem.created_at)  # 1709654400
print(mem.importance)  # 0.0
print(mem.embedding)   # [0.1, 0.2, ...] or None

.delete_memory(mid)

Permanently deletes a memory and updates all indexes.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| mid | int | Memory ID to delete |

Raises: CortexaDBNotFoundError if the memory doesn't exist.

Example:

db.delete_memory(42)

Graph Operations

.connect(from_id, to_id, relation)

Creates a directed edge between two memories.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| from_id | int | Source memory ID |
| to_id | int | Target memory ID |
| relation | str | Relationship label |

Example:

db.connect(1, 2, "relates_to")
db.connect(1, 3, "caused_by")

Both memories must be in the same namespace. Cross-namespace edges are forbidden.


.get_neighbors(mid)

Returns all outgoing edges from a memory.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| mid | int | Memory ID |

Returns: list[Edge] where each Edge has to (int) and relation (str) fields.

Example:

neighbors = db.get_neighbors(1)
for edge in neighbors:
    print(f"→ {edge.to} ({edge.relation})")
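
The graph expansion that .ask(use_graph=True) performs is described only as "via BFS". A standalone sketch of that idea, using a plain adjacency dict in place of the database and an assumed depth limit (CortexaDB's actual traversal parameters aren't documented here):

```python
from collections import deque

def expand_hits(seed_ids, neighbors, max_depth=1):
    """Breadth-first expansion of an initial hit set.

    neighbors: dict mapping memory ID -> list of connected memory IDs,
    standing in for repeated get_neighbors() calls.
    Returns the seed IDs plus everything reachable within max_depth hops.
    """
    seen = set(seed_ids)
    frontier = deque((mid, 0) for mid in seed_ids)
    while frontier:
        mid, depth = frontier.popleft()
        if depth == max_depth:
            continue
        for nxt in neighbors.get(mid, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen

graph = {1: [2, 3], 2: [4], 3: []}
expanded = expand_hits([1], graph, max_depth=1)  # {1, 2, 3}
```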

Document Ingestion

.ingest(text, strategy="recursive", chunk_size=512, overlap=50, metadata=None, namespace=None)

Chunks text and stores each chunk as a memory.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | Required | Text to chunk and store |
| strategy | str | "recursive" | Chunking strategy |
| chunk_size | int | 512 | Target chunk size in characters |
| overlap | int | 50 | Overlap between chunks |
| metadata | dict? | None | Metadata to attach to all chunks |
| namespace | str? | None | Target namespace |

Returns: list[int] - Memory IDs of stored chunks
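
What chunk_size and overlap mean can be made concrete with a minimal sliding-window chunker. This is a simplification for illustration: it ignores whatever boundary-aware logic the "recursive" strategy applies, and fixed_chunks is not a CortexaDB function.

```python
def fixed_chunks(text, chunk_size=512, overlap=50):
    """Split text into fixed-size windows overlapping by `overlap` chars.

    Each chunk starts (chunk_size - overlap) characters after the
    previous one, so the tail of one chunk repeats at the head of the next.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

chunks = fixed_chunks("a" * 1000, chunk_size=512, overlap=50)
# Windows cover 0-512, 462-974, and 924-1000: three chunks,
# each sharing its last 50 characters with the next chunk's start.
```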


.load(file_path, strategy="markdown", chunk_size=512, overlap=50, metadata=None, namespace=None)

Loads a file, chunks it, and stores each chunk.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| file_path | str | Required | Path to the file |
| strategy | str | "markdown" | Chunking strategy |
| chunk_size | int | 512 | Target chunk size |
| overlap | int | 50 | Overlap between chunks |
| metadata | dict? | None | Metadata for all chunks |
| namespace | str? | None | Target namespace |

Supported formats: .txt, .md, .json, .docx (requires cortexadb[docs]), .pdf (requires cortexadb[pdf])

Example:

db.load("README.md", strategy="markdown")
db.load("paper.pdf", strategy="recursive", chunk_size=1024)

.ingest_document(text, chunk_size=512, overlap=50, metadata=None, namespace=None)

Legacy method for chunking and storing text. Uses fixed chunking.


Namespace

.namespace(name, readonly=False)

Returns a scoped view of the database for a specific namespace.

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | str | Required | Namespace name |
| readonly | bool | False | If True, write operations raise errors |

Returns: Namespace

Example:

ns = db.namespace("agent_a")
mid = ns.remember("text")
hits = ns.ask("query")
ns.delete_memory(mid)
ns.ingest_document("long text")

Maintenance Operations

.compact()

Reclaims disk space by removing deleted entries from segment files.

.flush()

Forces all pending writes to be synced to disk.

.checkpoint()

Creates a binary snapshot of the current state and truncates the WAL. Also saves the HNSW index if using HNSW mode.

.stats()

Returns database statistics.

Returns: Stats

stats = db.stats()
print(stats.entries)              # Total memory count
print(stats.indexed_embeddings)   # Embeddings in vector index
print(stats.wal_length)           # WAL file size in bytes
print(stats.vector_dimension)     # Configured vector dimension
print(stats.storage_version)      # Storage format version

Replay Properties

.last_replay_report

Diagnostic report from the most recent replay() call.

Type: dict

| Key | Type | Description |
| --- | --- | --- |
| total_ops | int | Total operations in the log |
| applied | int | Successfully applied |
| skipped | int | Skipped (malformed) |
| failed | int | Failed (execution error) |
| op_counts | dict | Per-type counts |
| failures | list | Up to 50 failure details |
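
Given those keys, a quick health check over the report might look like the following. The dict values here are made-up sample data, not output from a real replay:

```python
def summarize_replay(report):
    """Return a one-line health summary for a replay report dict."""
    total = report["total_ops"]
    bad = report["skipped"] + report["failed"]
    rate = report["applied"] / total if total else 1.0
    return f"{report['applied']}/{total} applied ({rate:.0%}), {bad} problem op(s)"

sample = {
    "total_ops": 100, "applied": 97, "skipped": 2, "failed": 1,
    "op_counts": {"remember": 80, "connect": 20}, "failures": [],
}
print(summarize_replay(sample))  # 97/100 applied (97%), 3 problem op(s)
```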

.last_export_replay_report

Diagnostic report from the most recent export_replay() call.

Type: dict

| Key | Type | Description |
| --- | --- | --- |
| exported | int | Memories written |
| skipped_missing_embedding | int | Entries without vectors |
| skipped_missing_id | int | Gaps in ID space |
| errors | list | Unexpected errors |

Export

.export_replay(path)

Exports the current database state as a replay log.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| path | str | Output file path |

Types

Hit

Query result from .ask().

| Field | Type | Description |
| --- | --- | --- |
| id | int | Memory ID |
| score | float | Relevance score (0.0 to 1.0) |

Memory

Full memory entry from .get_memory().

| Field | Type | Description |
| --- | --- | --- |
| id | int | Memory ID |
| namespace | str | Namespace name |
| content | bytes | Raw content |
| embedding | list[float]? | Vector embedding |
| metadata | dict[str, str] | Key-value metadata |
| created_at | int | Unix timestamp |
| importance | float | Importance score |

Stats

Database statistics from .stats().

| Field | Type | Description |
| --- | --- | --- |
| entries | int | Total memory count |
| indexed_embeddings | int | Embeddings in index |
| wal_length | int | WAL size in bytes |
| vector_dimension | int | Vector dimension |
| storage_version | int | Format version |

ChunkResult

Result from chunk().

| Field | Type | Description |
| --- | --- | --- |
| text | str | Chunk content |
| index | int | Zero-based index |
| metadata | dict? | Optional metadata |

Standalone Functions

chunk(text, strategy="recursive", chunk_size=512, overlap=50)

Chunks text without storing it.

Returns: list[ChunkResult]

from cortexadb import chunk

chunks = chunk("Long text...", strategy="recursive", chunk_size=512, overlap=50)

Exceptions

| Exception | Description |
| --- | --- |
| CortexaDBError | Base exception for all CortexaDB errors |
| CortexaDBNotFoundError | Memory or file not found |
| CortexaDBConfigError | Invalid configuration |
| CortexaDBIOError | I/O failure |
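
Since every exception derives from CortexaDBError, one except clause can catch them all. A sketch of the hierarchy implied by the table above (reconstructed from the descriptions, not copied from the package source):

```python
class CortexaDBError(Exception):
    """Base exception for all CortexaDB errors."""

class CortexaDBNotFoundError(CortexaDBError):
    """Memory or file not found."""

class CortexaDBConfigError(CortexaDBError):
    """Invalid configuration."""

class CortexaDBIOError(CortexaDBError):
    """I/O failure."""

# Catching the base class covers every subtype:
try:
    raise CortexaDBNotFoundError("memory 42 does not exist")
except CortexaDBError as exc:
    caught = type(exc).__name__
```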