Visualizes similarity between user-defined gene signatures and either other user-defined signatures or MSigDB gene sets, using either the Jaccard index or Fisher's Odds Ratio. Produces a heatmap of pairwise similarity metrics.

Usage

geneset_similarity(
  signatures,
  other_user_signatures = NULL,
  collection = NULL,
  subcollection = NULL,
  metric = c("jaccard", "odds_ratio"),
  universe = NULL,
  or_threshold = 1,
  pval_threshold = 0.05,
  limits = NULL,
  title_size = 12,
  color_values = c("#F9F4AE", "#B44141"),
  title = NULL,
  jaccard_threshold = 0,
  msig_subset = NULL,
  width_text = 20,
  na_color = "grey90"
)

Arguments

signatures

A named list of character vectors representing reference gene signatures.

other_user_signatures

Optional. A named list of character vectors representing other user-defined signatures to compare against.

collection

Optional. MSigDB collection name (e.g., "H" for hallmark, "C2" for curated gene sets). Use msigdbr::msigdbr_collections() for the available options.

subcollection

Optional. Subcategory within an MSigDB collection (e.g., "CP:REACTOME"). Use msigdbr::msigdbr_collections() for the available options.

metric

Character. Either "jaccard" or "odds_ratio".
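For orientation, the Jaccard index of two gene sets is the size of their intersection divided by the size of their union. The base R sketch below illustrates that definition only; `jaccard_index` is a hypothetical helper written for this example and is not taken from the package's internals.

```r
# Hypothetical helper: Jaccard index of two character vectors of gene symbols.
jaccard_index <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  union_size <- length(union(a, b))
  # Two empty sets have no defined similarity
  if (union_size == 0) {
    return(NA_real_)
  }
  length(intersect(a, b)) / union_size
}

sig_a <- c("TP53", "BRCA1", "MYC", "EGFR", "CDK2")
sig_b <- c("MYC", "CDK2", "MTOR", "CASP3")

# 2 shared genes (MYC, CDK2) out of 7 distinct genes in total
jaccard_index(sig_a, sig_b)
```

A value of 0 means the two sets share no genes and a value of 1 means they are identical.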

universe

Character vector. Background gene universe. Required when metric = "odds_ratio".
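To show why a universe is needed, the sketch below builds the 2x2 contingency table that a Fisher's exact test operates on: every gene in the universe is classified by whether it belongs to each of the two sets. This is a minimal illustration using `stats::fisher.test()`; the helper name `overlap_odds_ratio`, the two-sided alternative, and the toy gene names are assumptions, and the package may compute or transform the statistic differently (the `-Inf` scores in the example output on this page suggest some transformation, though the documentation does not state one).

```r
# Hypothetical helper: odds ratio and p-value for the overlap of two gene sets,
# given a background universe. Assumes a two-sided Fisher's exact test.
overlap_odds_ratio <- function(a, b, universe) {
  universe <- unique(universe)
  in_a <- universe %in% a
  in_b <- universe %in% b
  # Fix the factor levels so the table is always 2x2, even with empty cells
  tab <- table(
    in_a = factor(in_a, levels = c(TRUE, FALSE)),
    in_b = factor(in_b, levels = c(TRUE, FALSE))
  )
  ft <- stats::fisher.test(tab)
  list(odds_ratio = unname(ft$estimate), p_value = ft$p.value, table = tab)
}

# Toy data: a 20-gene universe with two overlapping sets
toy_universe <- paste0("G", 1:20)
set_a <- paste0("G", 1:5)
set_b <- paste0("G", 3:8)

res <- overlap_odds_ratio(set_a, set_b, toy_universe)
res$table
res$odds_ratio
res$p_value
```

The universe determines the "in neither set" cell of the table, so a larger or smaller background changes both the odds ratio and the p-value.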

or_threshold

(only if metric == "odds_ratio") Numeric. Minimum Odds Ratio required for a gene set to be included in the plot. Default is 1.

pval_threshold

(only if metric == "odds_ratio") Numeric. Maximum adjusted p-value to show a label. Default is 0.05.
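As a rough illustration of how a p-value cutoff can control which cells are labeled, the sketch below adjusts a few made-up p-values and blanks the label wherever the adjusted value exceeds the threshold. The Benjamini-Hochberg method, the `<=` comparison, and all numbers are assumptions made for this example; the documentation does not say which adjustment the function uses.

```r
# Made-up raw p-values and scores for four gene sets (illustrative only)
raw_p <- c(set_1 = 0.001, set_2 = 0.02, set_3 = 0.04, set_4 = 0.5)
score <- c(set_1 = 2.6, set_2 = 1.9, set_3 = 1.4, set_4 = 0.3)
pval_threshold <- 0.05

# Assumption: Benjamini-Hochberg correction for multiple testing
adj_p <- p.adjust(raw_p, method = "BH")

# Keep a rounded score as the label only where the adjusted p-value passes
label <- ifelse(adj_p <= pval_threshold, format(round(score, 1), nsmall = 1), "")

data.frame(score, raw_p, adj_p, label)
```

Here `set_3` has a raw p-value below 0.05 but loses its label after adjustment, which is the behavior a threshold on adjusted p-values is meant to produce.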

limits

Numeric vector of length 2. Limits for color scale.

title_size

Integer specifying the font size for the plot title. Default is 12.

color_values

Character vector of colors used for the fill gradient. Default is c("#F9F4AE", "#B44141").

title

Optional. Custom title for the plot. If NULL, the title defaults to "Signature Overlap".

jaccard_threshold

(only if metric == "jaccard") Numeric. Minimum Jaccard index required for a gene set to be included in the plot. Default is 0.

msig_subset

Optional. Character vector of MSigDB gene set names to subset from the specified collection. Useful to restrict analysis to a specific set of pathways. If supplied, other filters will apply only to this subset. Use collection = "all" to mix gene sets from different collections.

width_text

Integer. Character wrap width for labels.

na_color

Character. Color for NA values in the heatmap. Default is "grey90".

Value

Invisibly returns a list containing:

plot

The ggplot2 object of the similarity heatmap.

data

The data frame containing the similarity score for each pair of gene sets.
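Because the result is returned invisibly, it has to be assigned to a variable before the data element can be inspected. The sketch below shows one way to work with such a data frame once it is in hand; it rebuilds three rows by hand from the Jaccard example output on this page (column names as printed there) so that it runs without the package, and the ranking logic is only a suggestion.

```r
# Stand-in for the `data` element, with rows copied from the Jaccard example output
sim_data <- data.frame(
  Reference_Signature = c("SignatureA", "SignatureA", "SignatureB"),
  Compared_Signature = c(
    "HALLMARK_APICAL_SURFACE",
    "HALLMARK_WNT_BETA_CATENIN_SIGNALING",
    "HALLMARK_CHOLESTEROL_HOMEOSTASIS"
  ),
  Score = c(0.020833333, 0.044444444, 0.012987013),
  stringsAsFactors = FALSE
)

# Rank all pairs from most to least similar
ranked <- sim_data[order(-sim_data$Score), ]

# Keep the best-matching gene set for each reference signature
best_per_signature <- ranked[!duplicated(ranked$Reference_Signature), ]
best_per_signature
```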

Examples

# Create two simple gene signatures
sig1 <- c("TP53", "BRCA1", "MYC", "EGFR", "CDK2")
sig2 <- c("ATXN2", "FUS", "MTOR", "CASP3")
signatures <- list(SignatureA = sig1, SignatureB = sig2)

# Compare the signatures using the Jaccard index
plt <- geneset_similarity(
  signatures = signatures,
  metric = "jaccard",
  collection = "H",
  jaccard_threshold = 0.01
)

# Print the returned list (draws a small heatmap and prints the data frame)
print(plt)
#> $plot

#> 
#> $data
#>    Reference_Signature                  Compared_Signature       Score Label
#> 6           SignatureA             HALLMARK_APICAL_SURFACE 0.020833333  0.02
#> 7           SignatureA                  HALLMARK_APOPTOSIS 0.012195122  0.01
#> 9           SignatureA    HALLMARK_CHOLESTEROL_HOMEOSTASIS 0.000000000  0.00
#> 13          SignatureA                HALLMARK_E2F_TARGETS 0.014851485  0.01
#> 33          SignatureA             HALLMARK_MYC_TARGETS_V2 0.016129032  0.02
#> 40          SignatureA    HALLMARK_PI3K_AKT_MTOR_SIGNALING 0.018518519  0.02
#> 41          SignatureA          HALLMARK_PROTEIN_SECRETION 0.010000000  0.01
#> 49          SignatureA HALLMARK_WNT_BETA_CATENIN_SIGNALING 0.044444444  0.04
#> 56          SignatureB             HALLMARK_APICAL_SURFACE 0.000000000  0.00
#> 57          SignatureB                  HALLMARK_APOPTOSIS 0.006097561  0.01
#> 59          SignatureB    HALLMARK_CHOLESTEROL_HOMEOSTASIS 0.012987013  0.01
#> 63          SignatureB                HALLMARK_E2F_TARGETS 0.000000000  0.00
#> 83          SignatureB             HALLMARK_MYC_TARGETS_V2 0.000000000  0.00
#> 90          SignatureB    HALLMARK_PI3K_AKT_MTOR_SIGNALING 0.000000000  0.00
#> 91          SignatureB          HALLMARK_PROTEIN_SECRETION 0.000000000  0.00
#> 99          SignatureB HALLMARK_WNT_BETA_CATENIN_SIGNALING 0.000000000  0.00
#>    Pval
#> 6    NA
#> 7    NA
#> 9    NA
#> 13   NA
#> 33   NA
#> 40   NA
#> 41   NA
#> 49   NA
#> 56   NA
#> 57   NA
#> 59   NA
#> 63   NA
#> 83   NA
#> 90   NA
#> 91   NA
#> 99   NA
#> 


# Odds ratio example (requires universe)
gene_universe <- unique(c(
  sig1, sig2,
  # `collection` replaces the deprecated `category` argument (msigdbr >= 10.0.0)
  msigdbr::msigdbr(species = "Homo sapiens", collection = "C2")$gene_symbol
))

plt_or <- geneset_similarity(
  signatures = signatures,
  metric = "odds_ratio",
  universe = gene_universe,
  collection = "H"
)
print(plt_or)
#> $plot

#> 
#> $data
#>              Reference_Signature                  Compared_Signature    Score
#> odds ratio1           SignatureA        HALLMARK_ALLOGRAFT_REJECTION 1.868611
#> odds ratio4           SignatureA            HALLMARK_APICAL_JUNCTION 1.440616
#> odds ratio5           SignatureA             HALLMARK_APICAL_SURFACE 2.107832
#> odds ratio6           SignatureA                  HALLMARK_APOPTOSIS 1.964912
#> odds ratio8           SignatureA    HALLMARK_CHOLESTEROL_HOMEOSTASIS     -Inf
#> odds ratio10          SignatureA                 HALLMARK_COMPLEMENT     -Inf
#> odds ratio11          SignatureA                 HALLMARK_DNA_REPAIR 1.567364
#> odds ratio12          SignatureA                HALLMARK_E2F_TARGETS 2.222313
#> odds ratio14          SignatureA    HALLMARK_ESTROGEN_RESPONSE_EARLY 1.440616
#> odds ratio17          SignatureA             HALLMARK_G2M_CHECKPOINT 1.440616
#> odds ratio18          SignatureA                 HALLMARK_GLYCOLYSIS 1.440616
#> odds ratio21          SignatureA                    HALLMARK_HYPOXIA 1.440616
#> odds ratio22          SignatureA        HALLMARK_IL2_STAT5_SIGNALING 1.442817
#> odds ratio24          SignatureA      HALLMARK_INFLAMMATORY_RESPONSE 1.440616
#> odds ratio26          SignatureA  HALLMARK_INTERFERON_GAMMA_RESPONSE     -Inf
#> odds ratio31          SignatureA             HALLMARK_MYC_TARGETS_V1 1.868611
#> odds ratio32          SignatureA             HALLMARK_MYC_TARGETS_V2 1.985916
#> odds ratio36          SignatureA                HALLMARK_P53_PATHWAY 1.440616
#> odds ratio39          SignatureA    HALLMARK_PI3K_AKT_MTOR_SIGNALING 2.155036
#> odds ratio40          SignatureA          HALLMARK_PROTEIN_SECRETION 1.763295
#> odds ratio42          SignatureA            HALLMARK_SPERMATOGENESIS     -Inf
#> odds ratio44          SignatureA    HALLMARK_TNFA_SIGNALING_VIA_NFKB 1.440616
#> odds ratio45          SignatureA  HALLMARK_UNFOLDED_PROTEIN_RESPONSE     -Inf
#> odds ratio46          SignatureA             HALLMARK_UV_RESPONSE_DN 1.585048
#> odds ratio47          SignatureA             HALLMARK_UV_RESPONSE_UP 1.544536
#> odds ratio48          SignatureA HALLMARK_WNT_BETA_CATENIN_SIGNALING 2.560755
#> odds ratio51          SignatureB        HALLMARK_ALLOGRAFT_REJECTION     -Inf
#> odds ratio54          SignatureB            HALLMARK_APICAL_JUNCTION     -Inf
#> odds ratio55          SignatureB             HALLMARK_APICAL_SURFACE     -Inf
#> odds ratio56          SignatureB                  HALLMARK_APOPTOSIS 1.661016
#> odds ratio58          SignatureB    HALLMARK_CHOLESTEROL_HOMEOSTASIS 2.001916
#> odds ratio60          SignatureB                 HALLMARK_COMPLEMENT 1.565718
#> odds ratio61          SignatureB                 HALLMARK_DNA_REPAIR     -Inf
#> odds ratio62          SignatureB                HALLMARK_E2F_TARGETS     -Inf
#> odds ratio64          SignatureB    HALLMARK_ESTROGEN_RESPONSE_EARLY     -Inf
#> odds ratio67          SignatureB             HALLMARK_G2M_CHECKPOINT     -Inf
#> odds ratio68          SignatureB                 HALLMARK_GLYCOLYSIS     -Inf
#> odds ratio71          SignatureB                    HALLMARK_HYPOXIA     -Inf
#> odds ratio72          SignatureB        HALLMARK_IL2_STAT5_SIGNALING 1.567921
#> odds ratio74          SignatureB      HALLMARK_INFLAMMATORY_RESPONSE     -Inf
#> odds ratio76          SignatureB  HALLMARK_INTERFERON_GAMMA_RESPONSE 1.565718
#> odds ratio81          SignatureB             HALLMARK_MYC_TARGETS_V1     -Inf
#> odds ratio82          SignatureB             HALLMARK_MYC_TARGETS_V2     -Inf
#> odds ratio86          SignatureB                HALLMARK_P53_PATHWAY     -Inf
#> odds ratio89          SignatureB    HALLMARK_PI3K_AKT_MTOR_SIGNALING     -Inf
#> odds ratio90          SignatureB          HALLMARK_PROTEIN_SECRETION     -Inf
#> odds ratio92          SignatureB            HALLMARK_SPERMATOGENESIS 1.738046
#> odds ratio94          SignatureB    HALLMARK_TNFA_SIGNALING_VIA_NFKB     -Inf
#> odds ratio95          SignatureB  HALLMARK_UNFOLDED_PROTEIN_RESPONSE 1.816713
#> odds ratio96          SignatureB             HALLMARK_UV_RESPONSE_DN     -Inf
#> odds ratio97          SignatureB             HALLMARK_UV_RESPONSE_UP 1.669220
#> odds ratio98          SignatureB HALLMARK_WNT_BETA_CATENIN_SIGNALING     -Inf
#>              Label         Pval
#> odds ratio1    1.9 7.938562e-04
#> odds ratio4    1.4 4.426028e-02
#> odds ratio5    2.1 9.875136e-03
#> odds ratio6    2.0 5.156291e-04
#> odds ratio8        1.000000e+00
#> odds ratio10       1.000000e+00
#> odds ratio11   1.6 3.334514e-02
#> odds ratio12   2.2 7.115670e-06
#> odds ratio14   1.4 4.426028e-02
#> odds ratio17   1.4 4.426028e-02
#> odds ratio18   1.4 4.426028e-02
#> odds ratio21   1.4 4.426028e-02
#> odds ratio22   1.4 4.404295e-02
#> odds ratio24   1.4 4.426028e-02
#> odds ratio26       1.000000e+00
#> odds ratio31   1.9 7.938562e-04
#> odds ratio32   2.0 1.300081e-02
#> odds ratio36   1.4 4.426028e-02
#> odds ratio39   2.2 2.196912e-04
#> odds ratio40   1.8 2.144501e-02
#> odds ratio42       1.000000e+00
#> odds ratio44   1.4 4.426028e-02
#> odds ratio45       1.000000e+00
#> odds ratio46   1.6 3.202865e-02
#> odds ratio47   1.5 3.509823e-02
#> odds ratio48   2.6 3.484122e-05
#> odds ratio51       1.000000e+00
#> odds ratio54       1.000000e+00
#> odds ratio55       1.000000e+00
#> odds ratio56   1.7 2.870711e-02
#> odds ratio58   2.0 1.327247e-02
#> odds ratio60   1.6 3.556699e-02
#> odds ratio61       1.000000e+00
#> odds ratio62       1.000000e+00
#> odds ratio64       1.000000e+00
#> odds ratio67       1.000000e+00
#> odds ratio68       1.000000e+00
#> odds ratio71       1.000000e+00
#> odds ratio72   1.6 3.539156e-02
#> odds ratio74       1.000000e+00
#> odds ratio76   1.6 3.556699e-02
#> odds ratio81       1.000000e+00
#> odds ratio82       1.000000e+00
#> odds ratio86       1.000000e+00
#> odds ratio89       1.000000e+00
#> odds ratio90       1.000000e+00
#> odds ratio92   1.7 2.411357e-02
#> odds ratio94       1.000000e+00
#> odds ratio95   1.8 2.021402e-02
#> odds ratio96       1.000000e+00
#> odds ratio97   1.7 2.817792e-02
#> odds ratio98       1.000000e+00
#>