--- title: "Introduction to rchroma" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to rchroma} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## Introduction The rchroma package provides an R interface to ChromaDB, a vector database for storing and querying embeddings. This vignette demonstrates the basic usage of the package. ## Installation You can install the development version of rchroma from GitHub: ```{r eval=FALSE} # install.packages("remotes") remotes::install_github("cynkra/rchroma") ``` ## Installing ChromaDB Before using rchroma, you need to have a running ChromaDB instance. The easiest way to get started is using Docker: ```bash docker pull chromadb/chroma docker run -p 8000:8000 chromadb/chroma ``` This will start a ChromaDB server on `http://localhost:8000`. For other installation methods and configuration options, please refer to the [ChromaDB documentation](https://docs.trychroma.com/docs/overview/introduction). ## Basic Usage ### Connecting to ChromaDB First, we need to establish a connection to ChromaDB: ```{r eval=FALSE} library(rchroma) # Connect to a local ChromaDB instance client <- chroma_connect() # Check the connection heartbeat(client) version(client) ``` ### Managing Collections Collections are the main way to organize your data in ChromaDB: ```{r eval=FALSE} # Create a new collection create_collection(client, "my_collection") # List all collections list_collections(client) # Get a specific collection get_collection(client, "my_collection") ``` ### Working with Documents Documents are the basic unit of data in ChromaDB. Each document consists of text content and its associated embedding: ```{r eval=FALSE} # Add documents with embeddings docs <- c( "apple fruit", "banana fruit", "carrot vegetable" ) embeddings <- list( c(1.0, 0.0, 0.0), # apple c(0.8, 0.2, 0.0), # banana (similar to apple) c(0.0, 0.0, 1.0) # carrot (different) ) # Add documents to the collection add_documents( client, "my_collection", documents = docs, ids = c("doc1", "doc2", "doc3"), embeddings = embeddings ) # Query similar documents using embeddings results <- query( client, "my_collection", query_embeddings = list(c(1.0, 0.0, 0.0)), # should match apple best n_results = 2 ) ``` ### Updating and Deleting You can update or delete documents as needed: ```{r eval=FALSE} # Update embedding separately update_documents( client, "my_collection", ids = "doc1", embeddings = list(c(0.9, 0.1, 0.0)) # slightly different from original apple ) # Delete documents delete_documents(client, "my_collection", ids = "doc2") # removes banana ```