--- title: "riemtan: Statistical Analysis of Connectomes using Riemannian Geometry" author: "Nicolas Escobar, Jaroslaw Harezlak" date: "2025-02-12" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{riemtan: Statistical Analysis of Connectomes using Riemannian Geometry} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` # Introduction The `riemtan` package provides tools for statistical analysis of connectomes using Riemannian geometry. This package is particularly useful for researchers working with fMRI data and other applications where symmetric positive definite (SPD) matrices play a crucial role. ## Key Features - High-level interface for handling connectome data - Support for multiple Riemannian metrics (AIRM, Log-Euclidean, Euclidean, Log-Cholesky, Bures-Wasserstein) - Efficient parallel computation capabilities - Memory-efficient operations through reference-based manipulation - Comprehensive tools for geometric operations and statistical analysis # Installation You can install `riemtan` from CRAN: ```r install.packages("riemtan") ``` Or install the development version from GitHub: ```r devtools::install_github("nicoesve/riemtan") ``` # Basic Usage ## Loading the Package and Setting Up ```r library(riemtan) library(Matrix) library(future) # Enable parallel processing plan(multisession) ``` ## Working with Metrics The package provides several pre-configured metrics: ```r # Load the AIRM metric data(airm) # Other available metrics data(log_euclidean) data(euclidean) data(log_cholesky) data(bures_wasserstein) ``` ## Creating and Manipulating Samples ### Creating Random Samples ```r # Create an identity matrix id <- diag(10) |> as("dpoMatrix") |> Matrix::pack() # Generate two random samples sample1 <- rspdnorm(30, id, id, airm) # Centered at I sample2 <- rspdnorm(30, 2*id, id, airm) # Centered at 2I ``` ### Computing Different Representations ```r # Compute tangent space representations sample1$compute_unvecs() sample1$compute_tangents() # Compute manifold representations sample1$compute_conns() # Check if computations were successful !is.null(sample1$connectomes) # Should be TRUE ``` ## Statistical Analysis ### Computing the Fréchet Mean ```r # Create a sample from your connectome data conn_sample <- CSample$new(conns = your_connectomes, metric_obj = airm) # Compute the Fréchet mean conn_sample$compute_fmean() # Center the sample at the Fréchet mean conn_sample$center() ``` ### Computing Sample Statistics ```r # Compute variation conn_sample$compute_variation() # Compute sample covariance conn_sample$compute_sample_cov() ``` # Advanced Examples ## Discriminating Between Two Samples This example shows how to prepare data for clustering analysis: ```r # Combine two samples joint_conns <- c(sample1$connectomes, sample2$connectomes) combined_sample <- CSample$new(conns = joint_conns, metric_obj = airm) # Prepare for clustering combined_sample$compute_tangents() combined_sample$center() # Center at Fréchet mean combined_sample$compute_vecs() # Get vectorized representation # The vectorized representations can now be used with standard clustering algorithms ``` ## Working with Real Data When working with real connectome data, you'll typically need to pre-process your matrices: ```r # Convert your matrices to the correct format parsed_conns <- your_raw_conns |> purrr::map(\(x) as(x, "dpoMatrix")) |> purrr::map(Matrix::pack) # Create a CSample object conn_sample <- CSample$new(parsed_conns, airm) # Compute geometric statistics conn_sample$compute_fmean() conn_sample$compute_tangents() conn_sample$center() conn_sample$compute_vecs() ``` # Key Classes ## CSample Class The `CSample` class is the main workhorse of the package. It manages: - Connectome data in different representations (manifold, tangent space, vectorized) - Statistical computations - Automatic parallel processing when possible - Memory-efficient operations Key methods include: - `compute_tangents()`: Computes tangent space representations - `compute_conns()`: Computes manifold representations - `compute_vecs()`: Computes vectorized representations - `compute_fmean()`: Computes Fréchet mean - `center()`: Centers the sample at its Fréchet mean - `compute_variation()`: Computes sample variation - `compute_sample_cov()`: Computes sample covariance ## Metric Objects Metric objects (class `rmetric`) contain four essential functions: - `log`: Computes Riemannian logarithm - `exp`: Computes Riemannian exponential - `vec`: Performs vectorization - `unvec`: Performs inverse vectorization # Performance Considerations - The package automatically utilizes parallel processing for computationally intensive operations - Operations are performed by reference when possible to minimize memory usage - For large datasets, consider: - Using parallel processing (`plan(multisession)`) - Monitoring memory usage - Computing representations only as needed # References For more details about the mathematical foundations and methodology, refer to: 1. Pennec et al. - Introduction of the AIRM metric 2. Goni et al. - Applications in connectome analysis 3. Package documentation and vignettes