Semantic search lets you find notes by meaning, not just keywords. It uses vector embeddings to measure the semantic similarity between your query and your notes.
Semantic search is optional and disabled by default. It requires additional dependencies and resources.
How It Works
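As a rough illustration of the idea, the sketch below ranks notes by cosine similarity between a query vector and each note vector. The three-dimensional vectors and note titles are invented for the example; real embeddings come from the configured model and have hundreds of dimensions. This is a sketch of the general technique, not Basic Memory's implementation.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, note_vecs):
    """Return (title, score) pairs sorted from most to least similar."""
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in note_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings, for illustration only.
notes = {
    "Debugging in Python": [0.9, 0.1, 0.0],
    "Sourdough recipe": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]
ranking = rank_by_similarity(query, notes)
```

The note whose vector points in nearly the same direction as the query ranks first, even though no keywords are compared.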
Configuration
Enable semantic search in your config.
Embedding Providers
FastEmbed (Default)
FastEmbed provides local embedding models via ONNX Runtime.
Advantages:
- Runs locally (no API calls)
- Fast inference
- Low memory footprint
- No API costs
| Model | Dimensions | Description |
|---|---|---|
| BAAI/bge-small-en-v1.5 | 384 | Default. Balanced speed and quality |
| BAAI/bge-base-en-v1.5 | 768 | Higher quality, slower |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Fast, good for short texts |
OpenAI
Use OpenAI’s embedding API for higher-quality embeddings.
Advantages:
- State-of-the-art quality
- No local compute required
- Latest models
Disadvantages:
- Requires API key and internet
- Per-token costs
- Data sent to OpenAI
| Model | Dimensions | Cost (USD per 1M tokens) |
|---|---|---|
| text-embedding-3-small | 1536 | $0.02 |
| text-embedding-3-large | 3072 | $0.13 |
| text-embedding-ada-002 | 1536 | $0.10 (legacy) |
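To get a feel for what per-token pricing means in practice, the sketch below multiplies the prices from the table above by a token count. The workload (10,000 notes averaging 500 tokens each) is an assumption chosen for the example, and the function name is hypothetical; check current OpenAI pricing before relying on the figures.

```python
# Prices copied from the table above (USD per 1M tokens).
PRICE_PER_MILLION_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "text-embedding-ada-002": 0.10,
}


def estimate_embedding_cost(model, total_tokens):
    """Estimate the USD cost of embedding `total_tokens` tokens once."""
    return PRICE_PER_MILLION_TOKENS[model] * total_tokens / 1_000_000


# Assumed workload: 10,000 notes x 500 tokens = 5M tokens.
total_tokens = 10_000 * 500
small_cost = estimate_embedding_cost("text-embedding-3-small", total_tokens)
large_cost = estimate_embedding_cost("text-embedding-3-large", total_tokens)

print(f"text-embedding-3-small: ${small_cost:.2f}")
print(f"text-embedding-3-large: ${large_cost:.2f}")
```

Under these assumptions a full index costs roughly $0.10 with the small model and $0.65 with the large one. Reindexing after a model change incurs the cost again.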
Database Setup
SQLite (sqlite-vec)
SQLite uses the sqlite-vec extension for vector storage.
PostgreSQL (pgvector)
PostgreSQL uses the pgvector extension.
Usage
MCP Tool
CLI
Hybrid Search
Combine full-text and semantic search:
- Run full-text search (keyword matching)
- Run semantic search (meaning-based)
- Merge and re-rank results using Reciprocal Rank Fusion (RRF)
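The merge step can be sketched as follows. In Reciprocal Rank Fusion, each document earns 1 / (k + rank) from every result list it appears in, and the contributions are summed. The constant k = 60 is the value commonly used in the RRF literature; the value Basic Memory uses is not stated here, so treat it, the function name, and the note IDs as assumptions.

```python
def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked lists of document IDs into one list, best first.

    Each list contributes 1 / (k + rank) per document, with rank
    starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result lists from the two searches.
full_text = ["note-a", "note-b", "note-c"]
semantic = ["note-b", "note-d", "note-a"]

merged = reciprocal_rank_fusion([full_text, semantic])
print(merged)
```

Documents that appear in both lists (note-a and note-b here) rise above documents found by only one search. RRF works on ranks alone, so the raw scores of the two searches never need to be on the same scale.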
Performance Considerations
Embedding Generation
FastEmbed (Local)
- Speed: ~100-500 docs/sec (depending on hardware)
- Memory: ~200-500 MB for the model
- Batch processing: Enabled by default (batch_size=32)
Vector Search Performance
Vector search is slower than full-text search. Use hybrid search to get the best of both.
| Database | Backend | Search Time (10k docs) |
|---|---|---|
| SQLite | sqlite-vec | ~20-50ms |
| PostgreSQL | pgvector (IVFFlat) | ~10-30ms |
| PostgreSQL | pgvector (HNSW) | ~5-15ms |
Optimization Tips
Use smaller embedding models
Smaller dimensions = faster search:
- 384 dimensions: Faster, good for most use cases
- 768 dimensions: Balanced
- 1536+ dimensions: Higher quality, slower
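The effect of dimension count on index size can be estimated with simple arithmetic. The sketch below assumes one vector per document stored as 32-bit floats and ignores index overhead and any chunking of long notes, so real sizes will be larger.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def raw_vector_bytes(num_docs, dimensions):
    """Raw storage for one float32 vector per document, without index overhead."""
    return num_docs * dimensions * BYTES_PER_FLOAT32


for dims in (384, 768, 1536):
    mib = raw_vector_bytes(10_000, dims) / (1024 * 1024)
    print(f"{dims} dims: {mib:.1f} MiB")
# 384 dims: 14.6 MiB
# 768 dims: 29.3 MiB
# 1536 dims: 58.6 MiB
```

Storage and the work per distance computation both grow linearly with the number of dimensions, which is why smaller models search faster.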
Batch embedding generation
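A minimal sketch of batching, assuming the embedder accepts a list of texts and returns one vector per text. The batch size of 32 matches the default mentioned under Embedding Generation; `embed_in_batches` and `fake_embed_batch` are hypothetical names, and the fake embedder only stands in for a real model call.

```python
def batched(items, batch_size=32):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(texts, embed_batch, batch_size=32):
    """Embed all texts by calling `embed_batch` once per batch."""
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed_batch(batch):
    """Stand-in embedder: returns a one-dimensional vector per text."""
    return [[float(len(text))] for text in batch]


texts = [f"note {i}" for i in range(70)]
vectors = embed_in_batches(texts, fake_embed_batch)
batch_sizes = [len(batch) for batch in batched(texts)]
print(batch_sizes)  # [32, 32, 6]
```

Sending texts in groups amortizes per-call overhead, which matters most for remote APIs where each request adds network latency.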
Use PostgreSQL for large datasets
PostgreSQL with pgvector is much faster than SQLite for collections larger than about 10k documents.
Enable HNSW index (PostgreSQL)
Reindexing
Regenerate embeddings after changing models.
Search Quality
When to Use Semantic Search
✅ Good for:
- Finding conceptually similar notes
- Queries with synonyms or paraphrasing
- Discovering related topics
- Cross-lingual search (with multilingual models)
❌ Not ideal for:
- Exact keyword matching
- Searching for specific names or IDs
- Boolean logic (AND, OR, NOT)
- Very short queries (< 3 words)
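The guidance above could be turned into a simple client-side heuristic. The sketch below is purely illustrative: the function name and the returned labels are hypothetical and are not Basic Memory parameters. It covers only the two criteria that can be checked mechanically, Boolean operators and query length.

```python
import re

# Uppercase Boolean operators as whole words.
BOOLEAN_OPERATORS = re.compile(r"\b(AND|OR|NOT)\b")


def suggest_search_mode(query):
    """Suggest "full-text" or "semantic" based on the shape of the query."""
    if BOOLEAN_OPERATORS.search(query):
        return "full-text"  # Boolean logic needs keyword search
    if len(query.split()) < 3:
        return "full-text"  # very short queries carry little meaning to embed
    return "semantic"


print(suggest_search_mode("python logging"))
print(suggest_search_mode("how do I trace errors in my scripts"))
```

When in doubt, hybrid search avoids having to choose, since it runs both searches and merges the results.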
Example Comparisons
Full-Text Search
Query: python logging
Finds: Documents containing “python” AND “logging”
Misses: Documents about “debugging in Python” or “error handling”
Multilingual Search
Use multilingual models for cross-language search.
Privacy Considerations
For maximum privacy, use FastEmbed with local models.
Troubleshooting
ModuleNotFoundError: No module named 'fastembed'
Solution: Install semantic dependencies:
PostgreSQL: extension 'vector' not found
Solution: Install pgvector:
Semantic search returns no results
Causes:
- Embeddings not generated yet
- Model mismatch (changed model without reindexing)
OpenAI API key not found
Solution: Set environment variable:
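Before retrying, it can help to confirm that the variable is visible to the process. The sketch below assumes the key is read from `OPENAI_API_KEY`, the name OpenAI's own SDK uses; whether Basic Memory reads that exact variable is an assumption, and `openai_key_status` is a hypothetical helper.

```python
import os


def openai_key_status(env=None):
    """Return "set" if OPENAI_API_KEY holds a non-blank value, else "missing"."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    return "set" if key else "missing"


# Reports only whether the key exists; the key itself is never printed.
print(f"OPENAI_API_KEY is {openai_key_status()}")
```

A variable exported in one shell session is not visible to processes started elsewhere, such as an MCP client launched from a desktop app, so run the check in the same environment that starts Basic Memory.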
Next Steps
Search Guide
Learn advanced search techniques
Database Backends
Configure SQLite or PostgreSQL