Overview

NoPokeDB is a minimalist vector database that pairs hnswlib's approximate nearest neighbor search with SQLite for metadata storage. It is designed to be simple, durable, and fast: an embedded vector store without the operational overhead of an enterprise solution.

With over 2,000 PyPI downloads, NoPokeDB has proven useful for developers who need a quick, no-fuss vector database for embeddings, semantic search, or recommendation systems.

Key Features

  • Durable: HNSW index + SQLite metadata, fully persisted on disk
  • Crash Safety: Write-ahead operation log with automatic replay on restart
  • Fast: Batch inserts, bulk metadata fetch, auto-resize index for performance
  • Flexible Metrics: Supports cosine similarity, L2 distance, and inner product
  • CRUD Operations: add, add_many, get, upsert, and delete
  • Consistent Scoring: Higher scores = better matches across all metrics

Installation

pip install nopokedb

Quick Start

import numpy as np
from nopokedb import NoPokeDB

# Create (or load existing) DB with 128-d vectors
db = NoPokeDB(dim=128, max_elements=10_000, path="./vdb_data", space="cosine")

# Insert one vector
vec = np.random.rand(128).astype(np.float32)
vid = db.add(vec, metadata={"name": "foo"})

# Insert many at once (faster)
V = np.random.rand(5, 128).astype(np.float32)
metas = [{"i": i} for i in range(len(V))]
ids = db.add_many(V, metas)

# Query nearest neighbors
q = np.random.rand(128).astype(np.float32)
hits = db.query(q, k=3)
for h in hits:
    print(h["id"], h["score"], h["metadata"])

# Get / update / delete
print(db.get(vid))
db.upsert(vid, metadata={"name": "bar"})
db.delete(ids[0])

# Persist & close
db.save()
db.close()

Example Query Result

{
  'id': 0,
  'metadata': {'name': 'foo'},
  'score': 0.991,
  'distance': 0.009
}
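
In the result above, score and distance sum to 1, which is what one would expect if the cosine score is computed as 1 - distance. The sketch below shows one way a "higher is better" score could be derived from a distance for each metric. The helper name distance_to_score and the mappings for l2 and ip are assumptions made for illustration; the source does not state how NoPokeDB computes scores for those metrics.

```python
def distance_to_score(distance: float, space: str) -> float:
    """Map a distance to a score where higher means a better match.

    Hypothetical helper: the cosine mapping agrees with the example
    result (0.991 = 1 - 0.009); the l2 and ip mappings are assumptions.
    """
    if space == "cosine":
        # hnswlib's cosine distance is 1 - cosine similarity.
        return 1.0 - distance
    if space == "ip":
        # hnswlib's inner-product distance is 1 - dot(a, b).
        return 1.0 - distance
    if space == "l2":
        # Smaller distance is better, so negate it (assumed mapping).
        return -distance
    raise ValueError(f"unknown space: {space!r}")


print(round(distance_to_score(0.009, "cosine"), 3))
```

Whatever the exact mapping, the property that matters is monotonicity: a smaller distance always yields a larger score, so results can be sorted the same way for every metric.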

Architecture

HNSW Index
Uses hnswlib for fast approximate nearest neighbor search. Hierarchical Navigable Small World graphs provide excellent performance with configurable precision-speed tradeoffs.
SQLite Metadata
Stores associated metadata with ACID guarantees. Lightweight, embedded, and perfect for small to medium-sized vector collections without external dependencies.
Write-Ahead Log
Each operation is appended to an operation log and fsynced before it is applied. After a crash, the oplog is replayed automatically on restart, so operations that reached the log are not lost.
Auto-Resize
Automatically expands the HNSW index capacity as needed. Start small and grow organically without manual intervention.
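
The write-ahead log described above can be sketched with the standard library alone. This is a minimal illustration of the log-then-apply pattern, not NoPokeDB's actual code: the OpLog class, the JSON-lines file format, and the operation fields are all assumptions.

```python
import json
import os
import tempfile


class OpLog:
    """Minimal append-only operation log (illustrative sketch)."""

    def __init__(self, path: str) -> None:
        self.path = path

    def append(self, op: dict) -> None:
        # Write one JSON record per line and force it to disk
        # before the caller applies the operation.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(op) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self, apply) -> int:
        # Re-apply every logged operation; return how many were replayed.
        if not os.path.exists(self.path):
            return 0
        count = 0
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    op = json.loads(line)
                except json.JSONDecodeError:
                    # A torn record from a crash mid-write: stop here.
                    break
                apply(op)
                count += 1
        return count


# Usage: log two operations, then rebuild state from the log
# as a restart after a crash would.
log_path = os.path.join(tempfile.mkdtemp(), "oplog.jsonl")
log = OpLog(log_path)
log.append({"op": "add", "id": 0, "metadata": {"name": "foo"}})
log.append({"op": "delete", "id": 0})

state = {}


def apply(op: dict) -> None:
    if op["op"] == "add":
        state[op["id"]] = op["metadata"]
    elif op["op"] == "delete":
        state.pop(op["id"], None)


replayed = OpLog(log_path).replay(apply)
print(replayed, state)
```

For replay to be safe, applying an operation twice should leave the same state as applying it once, since a crash can happen after an operation is applied but before the log is cleared.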

API Reference

Initialization

db = NoPokeDB(
    dim=128,                  # Vector dimensionality
    max_elements=10_000,      # Initial capacity (auto-resizes)
    path="./vdb_data",        # Storage directory
    space="cosine",           # Distance metric: cosine, l2, ip
    M=16,                     # HNSW graph connectivity
    ef_construction=200,      # Index build quality
    ef=50                     # Query quality
)

Core Operations

add(vector, metadata)
Insert a single vector with metadata. Returns the assigned ID.
add_many(vectors, metadatas)
Batch insert multiple vectors. Much faster than repeated add() calls.
query(vector, k=10)
Find k nearest neighbors. Returns list of results with id, score, distance, metadata.
get(id)
Retrieve metadata for a specific vector ID.
upsert(id, metadata)
Update metadata for an existing vector.
delete(id)
Remove a vector and its metadata from the database.
save()
Persist the HNSW index to disk. Called automatically on close().
close()
Save and close the database connection.
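
To illustrate how query() results could be assembled, the sketch below joins nearest-neighbor output (labels and distances) with metadata fetched from SQLite in a single query. The table name, the JSON-in-TEXT schema, and the helpers fetch_many and build_hits are hypothetical; only the result keys (id, score, distance, metadata) come from the description above.

```python
import json
import sqlite3

# Assumed schema: one row per vector ID, metadata stored as JSON text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (id INTEGER PRIMARY KEY, data TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO metadata (id, data) VALUES (?, ?)",
    [(i, json.dumps({"i": i})) for i in range(5)],
)
conn.commit()


def fetch_many(conn, ids):
    """Fetch metadata for several IDs with one SELECT instead of one per ID."""
    if not ids:
        return {}
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, data FROM metadata WHERE id IN ({placeholders})",
        list(ids),
    )
    return {row_id: json.loads(data) for row_id, data in rows}


def build_hits(labels, distances, conn):
    """Combine index output with metadata, keeping the index's ranking."""
    metas = fetch_many(conn, labels)
    return [
        # Score assumes the cosine convention score = 1 - distance.
        {"id": i, "distance": d, "score": 1.0 - d, "metadata": metas.get(i)}
        for i, d in zip(labels, distances)
    ]


# Pretend the index returned these labels and cosine distances.
hits = build_hits([3, 1], [0.25, 0.5], conn)
for h in hits:
    print(h["id"], h["score"], h["metadata"])
```

Fetching all rows in one statement avoids a round trip per hit, and building the list from the labels rather than from the SQL rows preserves the nearest-first order.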

Use Cases

  • Semantic Search: Find similar documents or passages using text embeddings
  • Recommendation Systems: Content-based recommendations with item embeddings
  • Image Similarity: Visual search using image embeddings from CNNs
  • RAG Applications: Vector store for retrieval-augmented generation pipelines
  • Prototyping: Quick vector search setup without Docker or cloud services
  • Side Projects: Lightweight alternative to Pinecone, Weaviate, or Milvus

Distance Metrics

Cosine Similarity
space="cosine"
Measures the angle between vectors, ignoring their magnitude. Well suited to text embeddings and normalized features.
L2 Distance
space="l2"
Euclidean distance. Good for spatial coordinates and absolute magnitude differences.
Inner Product
space="ip"
Dot product similarity. Useful for pre-normalized embeddings and linear models.
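
For reference, the three spaces correspond to the following distance definitions in hnswlib, written here in plain Python. Note that hnswlib's "l2" is the squared Euclidean distance. Whether NoPokeDB reports these raw values or transforms them further is not stated in the source, so treat this as a sketch of the underlying formulas only; the function names are illustrative.

```python
import math


def l2_distance(a, b):
    # Squared Euclidean distance (hnswlib's "l2" omits the square root).
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ip_distance(a, b):
    # Inner-product distance: 1 - dot(a, b).
    return 1.0 - sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    # Cosine distance: 1 - dot(a, b) / (|a| * |b|).
    # Undefined for zero vectors (division by zero).
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return 1.0 - dot / norm


a = [1.0, 0.0]
b = [0.0, 2.0]  # orthogonal to a
c = [3.0, 0.0]  # same direction as a, larger magnitude

print(l2_distance(a, b))      # squared distance between a and b
print(ip_distance(a, c))      # can be negative for large dot products
print(cosine_distance(a, c))  # same direction, so magnitude is ignored
print(cosine_distance(a, b))  # orthogonal vectors
```

The vectors a and c point the same way but differ in length: cosine treats them as identical, while inner product rewards the larger magnitude. This is why inner product is usually paired with pre-normalized embeddings.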

Why NoPokeDB?

No Server Required
Embedded database: no Docker containers, no API servers, just import and use.
Persistent Storage
Everything saved to disk with crash recovery. Restart your app without losing data.
Simple API
Just a handful of methods. No complex configuration, no YAML files.
Fast Enough
HNSW provides sub-linear search time. Great for thousands to millions of vectors.

Project Stats

  • PyPI Downloads: 2,000+
  • Lines of Code: ~500
  • External Services: Zero