This function performs gene set enrichment analysis (GSEA) using fgsea for each contrast in a list of differential expression results.
It automatically determines the appropriate ranking statistic based on the gene set format unless specified by the user.
Arguments
- DEGList
A named list where each element represents a contrast and contains a data frame of differential expression results.
Each data frame must include at least the "t" statistic and the "B" statistic for each gene. Row names should correspond to gene identifiers.
- gene_sets
A named list where each element represents a gene set. Each gene set can be:
A vector of gene names (for unidirectional gene sets).
A data frame with two columns:
Column 1: Gene names.
Column 2: Expected direction (1 for upregulated genes, -1 for downregulated genes).
- stat
Optional. The statistic to use for ranking genes before GSEA. If NULL, it is automatically determined based on the gene set:
"B" for gene sets with no known direction (vectors).
"t" for unidirectional or bidirectional gene sets (data frames).
If provided, this argument overrides the automatic selection.
- ContrastCorrection
Logical, default is FALSE. If TRUE, applies an additional multiple testing correction (Benjamini–Hochberg) across all contrasts returned in the DEGList results list. This accounts for the number of contrasts tested per signature and provides more stringent control of the false discovery rate across multiple comparisons. If FALSE, the function only corrects for the number of gene sets.
- nPermSimple
Number of permutations in the simple fgsea implementation for preliminary estimation of P-values. Parameter from fgsea.
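The automatic choice of ranking statistic described for the stat argument can be sketched in base R as follows. This is a minimal illustration, not the function's actual source: the helper names choose_stat and rank_genes are hypothetical, and the sketch assumes that genes are ranked simply by sorting the chosen column into a named vector, which is the kind of input fgsea takes as its stats argument.

```r
# Hypothetical helper (not part of the documented API): pick the ranking
# statistic from the gene set format unless the caller supplies one.
choose_stat <- function(gene_set, stat = NULL) {
  if (!is.null(stat)) return(stat)            # a user-supplied value overrides
  if (is.data.frame(gene_set)) "t" else "B"   # data frame -> "t", vector -> "B"
}

# Hypothetical helper: build a named ranking vector, sorted in decreasing
# order, from one contrast's results (row names are gene identifiers).
rank_genes <- function(deg, stat) {
  ranks <- setNames(deg[[stat]], rownames(deg))
  sort(ranks, decreasing = TRUE)
}

set.seed(1)
deg <- data.frame(t = rnorm(5), B = rnorm(5), row.names = paste0("Gene", 1:5))

vec_set <- c("Gene1", "Gene3")
df_set  <- data.frame(Gene = c("Gene2", "Gene4"), Direction = c(1, -1))

stat_vec <- choose_stat(vec_set)               # vector input
stat_df  <- choose_stat(df_set)                # data frame input
stat_usr <- choose_stat(vec_set, stat = "t")   # explicit override

ranks <- rank_genes(deg, stat_df)
```

Here stat_vec is "B", stat_df is "t", and stat_usr is "t" because the explicit argument takes precedence over the format-based choice.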
Value
A named list where each element corresponds to a contrast. Each contrast contains a single data frame with GSEA results for all gene sets.
P-values are corrected for multiple testing across gene sets and, when ContrastCorrection = TRUE, additionally across all contrasts.
The result includes the standard fgsea output plus two additional columns:
pathway: The name of the gene set.
stat_used: The statistic used for ranking genes in that analysis ("t" or "B").
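To illustrate what correcting across contrasts means for the adjusted P-values, the sketch below compares a within-contrast Benjamini-Hochberg adjustment with one applied to the P-values pooled from all contrasts. The mock data and the pooling strategy are assumptions made for illustration; the documentation does not state exactly how the function groups P-values when ContrastCorrection = TRUE, so this is one plausible reading rather than the implementation.

```r
# Mock per-contrast results with only the columns needed for the sketch.
res <- list(
  Contrast1 = data.frame(pathway = c("SetA", "SetB"), pval = c(0.01, 0.20)),
  Contrast2 = data.frame(pathway = c("SetA", "SetB"), pval = c(0.03, 0.04))
)

# Within-contrast correction: adjusts only for the number of gene sets.
within <- lapply(res, function(df) {
  transform(df, padj = p.adjust(pval, method = "BH"))
})

# Across-contrast correction (assumed pooling): stack every contrast,
# adjust all P-values together, then split back into one table per contrast.
pooled <- do.call(rbind, Map(function(df, nm) cbind(contrast = nm, df),
                             res, names(res)))
pooled$padj <- p.adjust(pooled$pval, method = "BH")
across <- split(pooled[, c("pathway", "pval", "padj")], pooled$contrast)
```

With this mock data the pooled adjustment yields adjusted P-values that are at least as large as the within-contrast ones, which is the more stringent behaviour the ContrastCorrection argument describes.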
Examples
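# Example input data (seed set so the simulated statistics are reproducible)
set.seed(123)
DEGList <- list(
  Contrast1 = data.frame(t = rnorm(100), B = rnorm(100), row.names = paste0("Gene", 1:100)),
  Contrast2 = data.frame(t = rnorm(100), B = rnorm(100), row.names = paste0("Gene", 1:100))
)

# One gene set given as a vector and one as a data frame with directions
gene_sets <- list(
  UnidirectionalSet = c("Gene1", "Gene5", "Gene20"),
  BidirectionalSet = data.frame(Gene = c("Gene2", "Gene10", "Gene15"), Direction = c(1, -1, 1))
)

# Run GSEA for every contrast, letting the function choose the statistic
results <- runGSEA(DEGList, gene_sets)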