
RAGFlowChainR is an R package that brings Retrieval-Augmented Generation (RAG) capabilities to R, inspired by LangChain. It enables intelligent retrieval of documents from a local vector store (DuckDB), optional web search, and seamless integration with Large Language Models (LLMs).
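The retrieval step described above can be pictured with a small base R sketch. This is an illustration only, not how RAGFlowChainR works internally: `embed_toy()`, `cosine_dist()`, and `retrieve_toy()` are hypothetical helpers, and the word-count "embedding" merely stands in for a real embedding model.

```r
# Toy corpus standing in for rows of a vector store.
docs <- c(
  "R is a language for statistical computing and graphics",
  "Scrum is an agile framework for managing work",
  "DuckDB is an in-process analytical database"
)

# Hypothetical stand-in for a real embedding function: word counts over a
# shared vocabulary. Real embeddings come from a model, not from counts.
tokenize <- function(x) strsplit(tolower(x), "[^a-z0-9]+")[[1]]
vocab <- unique(unlist(lapply(docs, tokenize)))
embed_toy <- function(text) {
  tokens <- tokenize(text)
  vapply(vocab, function(w) sum(tokens == w), numeric(1))
}

# Cosine distance: 0 for vectors pointing the same way, 1 for orthogonal ones.
cosine_dist <- function(a, b) {
  denom <- sqrt(sum(a^2)) * sqrt(sum(b^2))
  if (denom == 0) return(1)
  1 - sum(a * b) / denom
}

# Rank documents by distance to the query and keep the closest top_k.
retrieve_toy <- function(query, top_k = 2) {
  q <- embed_toy(query)
  dist <- vapply(docs, function(d) cosine_dist(q, embed_toy(d)), numeric(1))
  ord <- order(dist)[seq_len(min(top_k, length(docs)))]
  data.frame(id = ord, page_content = docs[ord], dist = unname(dist[ord]))
}

hits <- retrieve_toy("Tell me about R")

# Assemble the retrieved text into a prompt for an LLM (not called here).
prompt <- paste0(
  "Context:\n", paste(hits$page_content, collapse = "\n"),
  "\n\nQuestion: Tell me about R"
)
```

The package replaces each toy piece with a real one: a model-based embedding function, a DuckDB-backed store, and an LLM call that receives the assembled context.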
Python version: RAGFlowChain (PyPI)
GitHub (Python): RAGFlowChain
install.packages("RAGFlowChainR")

To get the latest features or bug fixes, install the development version of RAGFlowChainR from GitHub:
# If needed
install.packages("remotes")
remotes::install_github("knowusuboaky/RAGFlowChainR")

See the full function reference or the package website for more details.
Sys.setenv(TAVILY_API_KEY = "your-tavily-api-key")
Sys.setenv(OPENAI_API_KEY = "your-openai-api-key")
Sys.setenv(GROQ_API_KEY = "your-groq-api-key")
Sys.setenv(ANTHROPIC_API_KEY = "your-anthropic-api-key")

To persist these keys across sessions, add them to your ~/.Renviron file.
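Before running the examples, it can help to confirm that the keys are actually visible to the current session. The following is a minimal base R sketch; `missing_keys()` is a hypothetical helper and not part of RAGFlowChainR.

```r
# Hypothetical helper (not part of RAGFlowChainR): return the names of the
# environment variables that are unset or empty in this session.
missing_keys <- function(keys) {
  values <- Sys.getenv(keys, unset = "")
  keys[!nzchar(values)]
}

required <- c(
  "TAVILY_API_KEY",
  "OPENAI_API_KEY",
  "GROQ_API_KEY",
  "ANTHROPIC_API_KEY"
)

missing <- missing_keys(required)
if (length(missing) > 0) {
  message("Not set: ", paste(missing, collapse = ", "))
}
```

In ~/.Renviron each key goes on its own line as a plain `NAME=value` entry, and R reads the file at startup, so restart the session after editing it.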
library(RAGFlowChainR)
# Local documents and websites to ingest
local_files <- c(
  "tests/testthat/test-data/sprint.pdf",
  "tests/testthat/test-data/introduction.pptx",
  "tests/testthat/test-data/overview.txt"
)
website_urls <- c("https://www.r-project.org")
crawl_depth <- 1

response <- fetch_data(
  local_paths = local_files,
  website_urls = website_urls,
  crawl_depth = crawl_depth
)

response
#> source title ...
#> 1 documents/sprint.pdf <NA> ...
#> 2 documents/introduction.pptx <NA> ...
#> 3 documents/overview.txt <NA> ...
#> 4 https://www.r-project.org R: The R Project for Statistical Computing ...
#> ...
cat(response$content[1])
#> Getting Started with Scrum\nCodeWithPraveen.com ...

# Create the DuckDB-backed vector store, overwriting any existing file
con <- create_vectorstore("tests/testthat/test-data/my_vectors.duckdb", overwrite = TRUE)

docs <- data.frame(head(response)) # reuse from fetch_data()

# Embed the documents and insert them into the store
insert_vectors(
  con = con,
  df = docs,
  embed_fun = embed_openai(),
  chunk_chars = 12000
)

# Build the search indexes, then retrieve the top 5 matches for a query
build_vector_index(con, type = c("vss", "fts"))
response <- search_vectors(con, query_text = "Tell me about R?", top_k = 5)

response
#> id page_content dist
#> 1 5 [Home]\nDownload\nCRAN\nR Project...\n... 0.2183
#> 2 6 [Home]\nDownload\nCRAN\nR Project...\n... 0.2183
#> ...
cat(response$page_content[1])
#> [Home]\nDownload\nCRAN\nR Project\nAbout R\nLogo\n...

rag_chain <- create_rag_chain(
  llm = call_llm,
  vector_database_directory = "tests/testthat/test-data/my_vectors.duckdb",
  method = "DuckDB",
  embedding_function = embed_openai(),
  use_web_search = FALSE
)
response <- rag_chain$invoke("Tell me about R")

response
#> $input
#> [1] "Tell me about R"
#>
#> $chat_history
#> [[1]] $role: "human", $content: "Tell me about R"
#> [[2]] $role: "assistant", $content: "R is a programming language..."
#>
#> $answer
#> [1] "R is a programming language and software environment commonly used for statistical computing and graphics..."
cat(response$answer)
#> R is a programming language and software environment commonly used for statistical computing and graphics...

call_llm(
  prompt = "Summarize the capital of France.",
  provider = "groq",
  model = "llama3-8b",
  temperature = 0.7,
  max_tokens = 200
)

chatLLM
The chatLLM package (now available on CRAN) offers a modular interface for interacting with LLM providers including OpenAI, Groq, Anthropic, DeepSeek, DashScope, and GitHub Models.
install.packages("chatLLM")

Features:
- Verbosity control (verbose = TRUE/FALSE)
- Model listing via list_models()
- Integration with RAGFlowChainR
- .Renviron-based key management