introduction

Load your single cell object(s)

AnanseSeurat requires a preprocessed single cell object. Before exporting data make sure the single cell object has been filtered for good quality cells, and has underwent sufficient pre-processing. For guides on pre-processing scRNA-seq and scATAC-seq there are some great vignettes and guides available. We will also set the working directory.

# if (!requireNamespace("remotes", quietly = TRUE)) {
#   install.packages("remotes")
# }
# Sys.unsetenv("GITHUB_PAT")
# remotes::install_github("JGASmits/AnanseSeurat")

library(AnanseSeurat)
library(Seurat)
library(Signac)

In this example we will use a preprocsed 10x PBMC multiome dataset (PBMC from a Healthy Donor (v1, 150x150) Single Cell Multiome ATAC + Gene Expression Dataset by Cell Ranger ARC 2.0.0). This data was pre-processed following the standard pre-processing from Signac. The pre-pocessed single cell object is available at Zenado: https://zenodo.org/record/7446267/

rds_file <- 'preprocessed_PDMC.Rds'
pbmc <- readRDS(rds_file)
DimPlot(pbmc,
        label = TRUE,
        repel = TRUE,
        reduction = "umap") + NoLegend()

This single cell object contains multimodal data, both RNA and ATAC signal from each cell. However in the case of a separate scRNA-seq object and scATAC-seq object, AnanseSeurat can still prepare files for running single cell Ananse. In the case of two seperate objects as input it is important that both the objects share their respective cluster names.

Output files using AnanseSeurat

We will start with outputting the CPM file, for this we select:

  1. the minimum amount of cells a cluster needs to have to be included via min_cells
  2. the output directory
  3. the cluster ID containing the cluster names
  4. the name of the Assay containing the RNA data.
export_CPM_scANANSE(
  pbmc,
  min_cells <- 25,
  output_dir = paste0(tempdir(),'/analysis'),
  cluster_id = 'predicted.id',
  RNA_count_assay = 'RNA'
)

Next we will output the ATAC peak matrix, , for this we select:

  1. the minimum amount of cells a cluster needs to have to be included via min_cells
  2. the output directory
  3. the cluster ID containing the cluster names
  4. the name of the Assay containing the ATAC peak data.
export_ATAC_scANANSE(
  pbmc,
  min_cells <- 25,
  output_dir = paste0(tempdir(),'/analysis'),
  cluster_id = 'predicted.id',
  ATAC_peak_assay = 'peaks'
)

Next we will generate the config and sample file needed for anansnake, we will also specify specific pairwise comparisons between clusters of interest. By default, scANANSE compares all clusters to a network based on the average values of all clusters. Additional comparisons can be specified, in this case, B-naive and B-memory cells were also specified to compare directly to each other.

contrasts <-  list('B-naive_B-memory',
                   'B-memory_B-naive',
                   'B-naive_CD16-Mono',
                   'CD16-Mono_B-naive')

config_scANANSE(
  pbmc,
  min_cells <- 25,
  output_dir = paste0(tempdir(),'/analysis'),
  cluster_id = 'predicted.id',
  genome = './data/hg38',
  additional_contrasts = contrasts
)

Finally we will calculate the markergenes for each cluster and between clusters of a specific comparison:

DEGS_scANANSE(
  pbmc,
  min_cells <- 25,
  output_dir = './analysis',
  cluster_id = 'predicted.id',
  additional_contrasts = contrasts
)

run Anansnake on the generated files

After this all your files are ready to run Anansnake for gene regulatory network analysis. For info on installing and running anansnake see: https://github.com/vanheeringen-lab/anansnake

Import influence scores back into your single cell object

After running anansnake the results can be incorporated back into your single cell object.

pbmc <- import_seurat_scANANSE(pbmc,
                               cluster_id = 'predicted.id',
                               anansnake_inf_dir = "./analysis/influence/")

The top TFs contributing to your clusters can be vizualized as a table:

TF_influence <- per_cluster_df(pbmc,
                               assay = 'influence',
                               cluster_id = 'predicted.id')

head(TF_influence)

We can also visualize the influence score of TFs in the single cell object:

highlight_TF1 <- c('STAT4', 'MEF2C')

DefaultAssay(object = pbmc) <- "RNA"
plot_expression1 <-
  FeaturePlot(pbmc, features = highlight_TF1, ncol = 1)
DefaultAssay(object = pbmc) <- "influence"
plot_ANANSE1 <-
  FeaturePlot(
    pbmc,
    ncol = 1,
    features = highlight_TF1,
    cols = c("darkgrey", "#fc8d59")
  )
print(plot_expression1 | plot_ANANSE1)