📦Chromadb
Chromadb
This database can give you a quick headstart with the persist option. If you dont specify any arguments a default persistent storage will be used.
Supported Arguments:
host: Optional[str] = None
port: Optional[int] = None
persist_path: Optional[str] = None
search_method: Optional[SearchMethod] = SearchMethod.SIMILARITY_SEARCH
search_options: Optional[dict] = Field(default_factory=dict)Supported Search Methods:
- similarity_search - Search Options: - k : The top k elements for searching 
 
 
- max_marginal_relevance_search - Search Options - k: Number of Documents to return. Defaults to 4. 
- fetch_k: Number of Documents to fetch to pass to MMR algorithm. 
- lambda_mult: Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity. Defaults to 0.5. 
 
 
Usage
A Vectordb definitely needs a embedding function and you connect these two components through a stack.
from langchain.docstore.document import Document as LangDocument
from genai_stack.vectordb.chromadb import ChromaDB
from genai_stack.vectordb.weaviate_db import Weaviate
from genai_stack.embedding.utils import get_default_embedding
from genai_stack.stack.stack import Stack
embedding = get_default_embedding()
# Will use default persistent settings for a quick start
chromadb = ChromaDB.from_kwargs()
chroma_stack = Stack(model=None, embedding=embedding, vectordb=chromadb)
# Add your documents
chroma_stack.vectordb.add_documents(
    documents=[
        LangDocument(
            page_content="Some page content explaining something", metadata={"some_metadata": "some_metadata"}
        )
    ]
)
        
# Search for content in your vectordb
chroma_stack.vectordb.search("page")You can also use different search_methods and search options when trying out more complicated usecases
chromadb = ChromaDB.from_kwargs(
    search_method="max_marginal_relevance_search", 
    search_options={"k": 2, "fetch_k": 10, "lambda_mult": 0.3}
)Last updated
