Introduction to SNPannotator package

Package functions:

Please refer to the package manual for detailed information and examples.

demo_annotation()

This function provides a quick way to test the package by generating a sample output for one variant. Report files are saved in the current working directory.

Example:


library(SNPannotator)

demo_annotation()

getConfigFile()

Copies a sample configuration file to the specified folder.

Function parameters:

dir.path, Path to an existing folder into which the file is copied.

Example:


library(SNPannotator)

getConfigFile('/home/user/analysis')

run_annotation()

This is the main package function. It receives the path to a configuration file (.ini) and runs the annotation with the settings defined in that file.

Function parameters:

configurationFilePath, The path to the configuration file.
verbose, Whether to display messages in the console. default: TRUE

Example:

run_annotation('/home/user/analysis/config.ini')
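
Because the function depends entirely on the configuration file, it can help to check the path before starting a run. The following is a minimal sketch; `check_config_path()` is a hypothetical helper, not part of SNPannotator, and the `.ini` extension check simply follows the description above.

```r
# Hypothetical helper (not part of SNPannotator): verify that the
# configuration file exists and has an .ini extension.
check_config_path <- function(configurationFilePath) {
  if (!file.exists(configurationFilePath)) {
    stop("Configuration file not found: ", configurationFilePath)
  }
  if (tolower(tools::file_ext(configurationFilePath)) != "ini") {
    stop("Expected a file with the .ini extension: ", configurationFilePath)
  }
  # Return the absolute path so it can be passed on unchanged
  normalizePath(configurationFilePath)
}

# Usage (requires SNPannotator and an existing configuration file):
# run_annotation(check_config_path('/home/user/analysis/config.ini'))
```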

findProxy()

This function finds variants that are in high LD with a list of selected variants.

Function parameters:

rslist, A vector of rs numbers.
file, Path to the Excel file for saving search results. Optional; if no file path is provided, the result is only returned to the R environment.
build, Genome build. Either 37 or 38. default: 37
db, The population database for calculating LD scores.
window_size, Number of base pairs around the variant for checking LD scores (max = 500kb).
r2, The minimum LD threshold for selecting variants around the target SNP. default: 0.8.

Example:

findProxy("rs234")

findProxy(c("rs234","rs678"))

findPairwiseLD()

This function computes the linkage disequilibrium (LD) between the selected variants using data from the Ensembl website.

Function parameters:

rslist, A vector of rs numbers.
file, Path to the Excel file for saving search results. Optional; if no file path is provided, the result is only returned to the R environment.
pairwise, If TRUE, computes pairwise LD between all elements of the list. If FALSE, computes the LD between the first element and each of the other elements. default: FALSE
build, Genome build. Either 37 or 38. default: 37
db, The population database for calculating LD scores.
window_size, Number of base pairs around the variant for checking LD scores (max = 500kb).
r2, The minimum LD threshold for selecting variants around the target SNP. default: 0.8.

Example:

output <- findPairwiseLD(c('rs234','rs10244533'))
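
The effect of the `pairwise` argument can be illustrated by listing which variant pairs each setting implies. This is a sketch only: `ld_pairs()` is a hypothetical helper written for this illustration, it does not query Ensembl, and the package's internal logic may differ.

```r
# Hypothetical helper (not part of SNPannotator): list the variant pairs
# implied by the `pairwise` argument, without computing any LD values.
ld_pairs <- function(rslist, pairwise = FALSE) {
  stopifnot(is.character(rslist), length(rslist) >= 2)
  if (pairwise) {
    # every unordered pair of variants in the list
    pairs <- t(combn(rslist, 2))
  } else {
    # the first variant against each of the other variants
    pairs <- cbind(rslist[1], rslist[-1])
  }
  data.frame(variant1 = pairs[, 1], variant2 = pairs[, 2],
             stringsAsFactors = FALSE)
}

rs <- c("rs234", "rs10244533", "rs678")
ld_pairs(rs)                   # first variant vs. the rest
ld_pairs(rs, pairwise = TRUE)  # all pairs
```

With n variants, the default setting implies n - 1 comparisons, whereas `pairwise = TRUE` implies n(n - 1)/2.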

findGenomicPos()

This function retrieves variant information from the GTEx portal using either an rsID or a variant ID formatted as CHR_POS_REF_ALT. If an rsID is provided, the function returns the corresponding genomic positions in both the GRCh37 and GRCh38 builds. When a variant ID is used to look up an rsID, the position in the ID must be given according to the GRCh38 reference genome.

Function parameters:

id, Character string representing the rsID (e.g., “rs12345”) or the variant ID formatted as CHR_POS_REF_ALT.
type, Character string specifying the type of query. Must be either “rsid” or “varid”.
file_path, Path to a file for saving the results as an Excel spreadsheet.

Example:

findGenomicPos('rs121')

findGenomicPos('7_24898067_A_G', type = "varid", file_path = "output.xlsx")
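
When variants are stored as separate columns, the CHR_POS_REF_ALT string has to be assembled first. The sketch below shows one way to do this; `make_variant_id()` and `parse_variant_id()` are hypothetical helpers, not package functions, and they only follow the format described above.

```r
# Hypothetical helper (not part of SNPannotator): build a CHR_POS_REF_ALT
# variant ID from its four components.
make_variant_id <- function(chr, pos, ref, alt) {
  # scientific = FALSE keeps large positions such as 100000000
  # from being written as 1e+08
  pos_chr <- format(pos, scientific = FALSE, trim = TRUE)
  paste(chr, pos_chr, toupper(ref), toupper(alt), sep = "_")
}

# Hypothetical helper: split a variant ID back into its components.
parse_variant_id <- function(varid) {
  parts <- strsplit(varid, "_", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 4)
  list(chr = parts[1], pos = as.numeric(parts[2]),
       ref = parts[3], alt = parts[4])
}

varid <- make_variant_id(7, 24898067, "A", "G")

# Usage (requires SNPannotator and network access):
# findGenomicPos(varid, type = "varid")
```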

findRSID()

This function retrieves variant information from Ensembl based on the specified genomic position. It takes the chromosome number, start position, and end position as input parameters and searches for variants within this window, using the specified genomic build. If only the start position is provided, the function automatically sets the end position equal to the start position. This is particularly relevant for SNP variants, where the start and end positions are the same. The function returns all variants found within the defined window.

Function parameters:

chromosome, Numeric, specifying the chromosome number.
start_position, Numeric, specifying the starting base pair position.
end_position, Numeric, specifying the ending base pair position.
build, Numeric, specifying the genomic build, default value is 38.
file_path, Character, path to a file for saving the results as an Excel spreadsheet.

Example:


findRSID(15, 79845218)

findRSID(15, 79845218, 79845238, file_path = "output.xlsx")

findRSID(15, 80137560, build = 37)
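
The default for the end position described above can be sketched as follows. `resolve_window()` is a hypothetical helper that only mirrors the documented behaviour; the package's own implementation may differ.

```r
# Hypothetical helper (not part of SNPannotator): derive the search window
# from the start position and an optional end position.
resolve_window <- function(start_position, end_position = NULL) {
  if (is.null(end_position)) {
    # single-base window, as for a SNP
    end_position <- start_position
  }
  stopifnot(end_position >= start_position)
  c(start = start_position, end = end_position)
}

resolve_window(79845218)            # start and end are identical
resolve_window(79845218, 79845238)  # window spanning 21 positions
```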

stringdb_annotation()

This function takes a vector of gene symbols, retrieves their interaction partners from STRING DB, and performs functional enrichment analysis.

Function parameters:

name, A character string specifying a unique identifier for this analysis run.
gene_list, A character vector of gene symbols (e.g., HGNC symbols or Ensembl gene IDs).
required_score, Threshold of significance to include an interaction, a number between 0 and 1000.
limit, Limits the number of interaction partners retrieved per protein, a number between 0 and 100.
..., Additional arguments passed to downstream functions for extended customization.

Example:

stringdb_annotation('BC_project',c('BRCA1','BRCA2'),limit=10)
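
Since `required_score` and `limit` are restricted to the ranges listed above, the arguments can be checked before a query is sent. The following is a minimal sketch; `check_stringdb_args()` is a hypothetical helper, not part of the package, and the score of 700 is only an illustrative value.

```r
# Hypothetical helper (not part of SNPannotator): check that the arguments
# fall within the documented ranges before calling stringdb_annotation().
check_stringdb_args <- function(gene_list, required_score, limit) {
  stopifnot(
    is.character(gene_list), length(gene_list) > 0,
    is.numeric(required_score), required_score >= 0, required_score <= 1000,
    is.numeric(limit), limit >= 0, limit <= 100
  )
  invisible(TRUE)
}

genes <- c('BRCA1', 'BRCA2')
check_stringdb_args(genes, required_score = 700, limit = 10)

# Usage (requires SNPannotator and network access):
# stringdb_annotation('BC_project', genes, required_score = 700, limit = 10)
```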
