ChromaDB
This database gives you a quick head start with the persist option. If you don't specify any arguments, a default persistent storage will be used.
Supported Arguments:
```python
host: Optional[str] = None
port: Optional[int] = None
persist_path: Optional[str] = None
search_method: Optional[SearchMethod] = SearchMethod.SIMILARITY_SEARCH
search_options: Optional[dict] = Field(default_factory=dict)
```
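If you are running a Chroma server, you can point the client at it instead of using local persistence. A minimal sketch; the host and port values below are illustrative:

```python
from genai_stack.vectordb.chromadb import ChromaDB

# Connect to a running Chroma server (host/port are illustrative values)
chromadb = ChromaDB.from_kwargs(host="localhost", port=8000)
```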
Supported Search Methods:
similarity_search
Search Options:
k: The number of top results to return.
max_marginal_relevance_search
Search Options:
k: Number of Documents to return. Defaults to 4.
fetch_k: Number of Documents to fetch to pass to the MMR algorithm.
lambda_mult: Number between 0 and 1 that determines the degree of diversity among the results, with 0 corresponding to maximum diversity and 1 to minimum diversity. Defaults to 0.5.
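For example, you can select a search method and pass its options when constructing the vectordb. A minimal sketch; the k value is illustrative:

```python
from genai_stack.vectordb.chromadb import ChromaDB

# Return the top 3 most similar documents (k is an illustrative value)
chromadb = ChromaDB.from_kwargs(
    search_method="similarity_search",
    search_options={"k": 3},
)
```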
Usage
A vectordb requires an embedding function, and you connect these two components through a stack.
```python
from langchain.docstore.document import Document as LangDocument

from genai_stack.vectordb.chromadb import ChromaDB
from genai_stack.embedding.utils import get_default_embedding
from genai_stack.stack.stack import Stack

embedding = get_default_embedding()

# Will use default persistent settings for a quick start
chromadb = ChromaDB.from_kwargs()
chroma_stack = Stack(model=None, embedding=embedding, vectordb=chromadb)

# Add your documents
chroma_stack.vectordb.add_documents(
    documents=[
        LangDocument(
            page_content="Some page content explaining something",
            metadata={"some_metadata": "some_metadata"},
        )
    ]
)

# Search for content in your vectordb
chroma_stack.vectordb.search("page")
```
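The search call queries the vectordb using the configured search method. Assuming it returns LangChain Document objects (an assumption based on the add_documents example above), you can inspect the matches:

```python
# Assumption: search returns a list of LangChain Documents
results = chroma_stack.vectordb.search("page")
for doc in results:
    print(doc.page_content, doc.metadata)
```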
You can also use different search methods and search options for more complicated use cases:
```python
chromadb = ChromaDB.from_kwargs(
    search_method="max_marginal_relevance_search",
    search_options={"k": 2, "fetch_k": 10, "lambda_mult": 0.3},
)
```
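As with the default setup, wire the configured vectordb into a stack; subsequent search calls then use MMR with these options. A sketch reusing the embedding from the earlier example:

```python
# Reuses `embedding` from the usage example above
mmr_stack = Stack(model=None, embedding=embedding, vectordb=chromadb)
mmr_stack.vectordb.search("page")  # now served by max_marginal_relevance_search
```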