Visualizes the similarity between user-defined gene signatures and either other user-defined signatures or MSigDB gene sets, using the Jaccard index or Fisher's odds ratio. Produces a heatmap of the pairwise similarity values.

Usage

signature_similarity(
  signatures,
  other_user_signatures = NULL,
  collection = NULL,
  subcollection = NULL,
  metric = c("jaccard", "odds_ratio"),
  universe = NULL,
  or_threshold = 1,
  pval_threshold = 0.05,
  limits = NULL,
  title_size = 12,
  color_values = c("#F9F4AE", "#B44141"),
  title = NULL,
  num_sigs_toplot = NULL,
  jaccard_threshold = 0,
  msig_subset = NULL,
  width_text = 20,
  na_color = "grey90"
)

Arguments

signatures

A named list of character vectors representing reference gene signatures.

other_user_signatures

Optional. A named list of character vectors representing other user-defined signatures to compare against.

collection

Optional. MSigDB collection name (e.g., "H" for hallmark, "C2" for curated gene sets). Use msigdbr::msigdbr_collections() for the available options.

subcollection

Optional. Subcategory within an MSigDB collection (e.g., "CP:REACTOME"). Use msigdbr::msigdbr_collections() for the available options.

metric

Character. Either "jaccard" or "odds_ratio".

universe

Character vector. Background gene universe. Required when metric = "odds_ratio".

or_threshold

Numeric. Minimum Odds Ratio required for a gene set to be included in the plot. Default is 1.

pval_threshold

Numeric. Maximum adjusted p-value for a label to be shown. Default is 0.05.

limits

Numeric vector of length 2. Limits for color scale.

title_size

Integer specifying the font size for the plot title. Default is 12.

color_values

Character vector of colors used for the fill gradient. Default is c("#F9F4AE", "#B44141").

title

Optional. Custom title for the plot. If NULL, the title defaults to "Signature Overlap".

num_sigs_toplot

Optional. Integer. Maximum number of comparison signatures (including user and MSigDB) to display.

jaccard_threshold

Numeric. Minimum Jaccard index required for a gene set to be included in the plot. Default is 0.

msig_subset

Optional. Character vector of MSigDB gene set names to subset from the specified collection. Useful to restrict analysis to a specific set of pathways. If supplied, other filters will apply only to this subset.

width_text

Integer. Character wrap width for labels.

na_color

Character. Color for NA values in the heatmap. Default is "grey90".
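
To illustrate how universe, or_threshold and pval_threshold relate to the "odds_ratio" metric, here is a minimal base-R sketch of an overlap odds ratio computed with Fisher's exact test. It is not the package's implementation: the helper overlap_odds_ratio(), the toy universe, and the choice of Benjamini-Hochberg adjustment are all assumptions made for this example.

```r
# Hypothetical helper (not part of the package): odds ratio and p-value
# for the overlap of two gene sets within a background universe.
overlap_odds_ratio <- function(sig, other, universe) {
  universe <- unique(universe)
  # Genes outside the universe cannot be counted in the 2x2 table.
  in_sig   <- factor(universe %in% sig,   levels = c(TRUE, FALSE))
  in_other <- factor(universe %in% other, levels = c(TRUE, FALSE))
  ft <- fisher.test(table(in_sig, in_other))
  c(odds_ratio = unname(ft$estimate), p_value = ft$p.value)
}

sig1 <- list(A = c("TP53", "BRCA1", "EGFR"))
sig2 <- list(B = c("TP53", "MYC", "EGFR"), C = c("GATA3", "STAT3"))

# Toy universe of 100 genes (assumption for illustration only).
universe <- c("TP53", "BRCA1", "EGFR", "MYC", "GATA3", "STAT3",
              paste0("GENE", 1:94))

# One row per comparison signature.
res <- t(sapply(sig2, function(other) {
  overlap_odds_ratio(sig1$A, other, universe)
}))

# Multiple-testing adjustment; BH is an assumed choice here.
res <- cbind(res, p_adj = p.adjust(res[, "p_value"], method = "BH"))

# Filters analogous to or_threshold = 1 and pval_threshold = 0.05.
keep <- res[, "odds_ratio"] >= 1 & res[, "p_adj"] <= 0.05
print(res)
print(keep)
```

In this toy setting B shares two of its three genes with A and passes both filters, while C shares none and is dropped. The result depends heavily on the universe: a larger background makes the same overlap look more surprising.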

Value

A ggplot heatmap object.

Examples

sig1 <- list(A = c("TP53", "BRCA1", "EGFR"))
sig2 <- list(B = c("TP53", "MYC", "EGFR"), C = c("GATA3", "STAT3"))

signature_similarity(
  signatures = sig1,
  other_user_signatures = sig2
)
#> Error in signature_similarity(signatures = sig1, other_user_signatures = sig2): object 'or_label_threshold' not found
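
Because the call above stops with an error in this build, no heatmap is shown. As a rough guide to the values the default "jaccard" metric would be expected to produce for these inputs, the sketch below computes the standard Jaccard index (intersection size divided by union size) by hand. The jaccard() helper is hypothetical and assumes the package uses the standard definition.

```r
# Hypothetical helper (not part of the package): standard Jaccard index.
jaccard <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  u <- union(a, b)
  if (length(u) == 0) return(NA_real_)  # undefined for two empty sets
  length(intersect(a, b)) / length(u)
}

sig1 <- list(A = c("TP53", "BRCA1", "EGFR"))
sig2 <- list(B = c("TP53", "MYC", "EGFR"), C = c("GATA3", "STAT3"))

# Rows: reference signatures; columns: comparison signatures.
sim <- matrix(NA_real_, nrow = length(sig1), ncol = length(sig2),
              dimnames = list(names(sig1), names(sig2)))
for (i in names(sig1)) {
  for (j in names(sig2)) {
    sim[i, j] <- jaccard(sig1[[i]], sig2[[j]])
  }
}
print(sim)
```

A and B share TP53 and EGFR out of four distinct genes, giving 0.5; A and C share nothing, giving 0.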