
Calculate Differential Gene Expression Statistics using limma
Source: R/calculateDE.R
This function computes differential gene expression statistics for each gene using a linear model via the limma package.
Users may supply a custom design matrix directly via the modelmat argument, specify a model formula via lmexpression (e.g., ~0 + X or ~X), or name variables from metadata to build the design matrix. When contrasts are supplied, they are applied using limma::makeContrasts and limma::contrasts.fit. Alternatively, when using lmexpression or a supplied design matrix, specific coefficient indices may be provided via coefs to extract the corresponding gene-level statistics.
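To make the two formula styles mentioned above concrete, the following base-R sketch builds both design matrices for a two-level factor. The toy metadata and the object names design_means and design_ref are illustrative assumptions, not part of the package.

```r
# Toy metadata (hypothetical): two conditions, three samples each
metadata <- data.frame(X = factor(rep(c("A", "B"), each = 3)))

# No-intercept coding: one column per condition, each a group mean
design_means <- model.matrix(~0 + X, data = metadata)
colnames(design_means)   # "XA" "XB"

# Intercept coding: baseline level (A) plus a difference column (B vs A)
design_ref <- model.matrix(~X, data = metadata)
colnames(design_ref)     # "(Intercept)" "XB"
```

With ~0 + X each coefficient is a group mean, so a contrast such as A-B is needed to obtain a difference. With ~X the XB coefficient already estimates the B minus A difference.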
Usage
calculateDE(
data,
metadata = NULL,
variables = NULL,
modelmat = NULL,
contrasts = NULL,
ignore_NAs = FALSE
)
Arguments
- data
A numeric matrix of gene expression values with genes as rows and samples as columns. Row names must correspond to gene identifiers. Data should not be transformed (i.e., not log2 transformed).
- metadata
A data frame containing sample metadata used to build the design matrix (unless a design is provided directly).
- variables
A character vector specifying the variable(s) from metadata to use in the default linear model. Ignored if lmexpression or modelmat is provided.
- modelmat
(Optional) A user-supplied design matrix. If provided, this design is used directly and lmexpression and variables are ignored. The order of samples in the design matrix should match the order in data.
- contrasts
A character vector specifying contrasts to be applied (e.g., c("A-B")). If multiple contrasts are provided, the function returns a list of DE results (one per contrast). Required if lmexpression is NULL, optional otherwise. If not provided, the average expression profile of each condition will be returned instead of differential gene expression.
- ignore_NAs
Boolean (default: FALSE). Whether to ignore NAs in the metadata. If TRUE, rows with any NAs will be removed before analysis, so fewer samples are available to fit the model. Only applicable if variables is provided.
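The effect of ignore_NAs = TRUE can be pictured with a short base-R sketch. This is an assumption about the general idea (dropping samples whose selected metadata variables contain NA and keeping data aligned with metadata), not the package's actual implementation. The toy data and the names keep, metadata_kept and data_kept are hypothetical.

```r
set.seed(1)

# Toy expression matrix (hypothetical): 4 genes x 10 samples
data <- matrix(rexp(40), nrow = 4, ncol = 10,
               dimnames = list(paste0("gene", 1:4), paste0("sample", 1:10)))
metadata <- data.frame(sample = colnames(data),
                       X = rep(c("A", "B"), each = 5))
metadata$X[c(2, 7)] <- NA   # two samples lack a condition label

# Keep only samples with complete values in the modelled variables
variables <- "X"
keep <- complete.cases(metadata[, variables, drop = FALSE])
metadata_kept <- metadata[keep, , drop = FALSE]

# Subset the expression columns by sample name so both objects stay aligned
data_kept <- data[, metadata_kept$sample, drop = FALSE]

dim(data_kept)   # 4 genes, 8 samples
```

Subsetting by sample name rather than by position also guards against metadata and data being in different orders.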
Details
The function fits a linear model with limma::lmFit and applies empirical Bayes moderation with limma::eBayes. Depending on the input:
- If a design matrix is provided via modelmat, that design is used directly.
- Otherwise, a design matrix is constructed using the variables argument (with no intercept).
- If contrasts are provided, they are applied using limma::makeContrasts and limma::contrasts.fit.
- If no contrasts are provided, the function returns all possible coefficients fitted in the linear model.
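The steps listed above correspond roughly to the following plain limma workflow. This is a minimal sketch using only limma functions named in this page plus limma::topTable. It assumes limma is installed, uses simulated values, and does not reproduce any preprocessing or output formatting that calculateDE itself may apply. The object names expr, group and contrast_mat are illustrative.

```r
library(limma)

set.seed(42)

# Simulated expression values (hypothetical): 100 genes x 10 samples
expr <- matrix(rnorm(1000), nrow = 100, ncol = 10,
               dimnames = list(paste0("gene", 1:100), paste0("sample", 1:10)))
group <- factor(rep(c("A", "B"), each = 5))

# No-intercept design with one column per condition
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)   # rename columns to "A" and "B"

# Fit the gene-wise linear models
fit <- lmFit(expr, design)

# Build the A-B contrast and re-express the fit in terms of it
contrast_mat <- makeContrasts(contrasts = "A-B", levels = design)
fit <- contrasts.fit(fit, contrast_mat)

# Empirical Bayes moderation of the gene-wise variances
fit <- eBayes(fit)

# One row of statistics per gene for the first (only) contrast
res <- topTable(fit, coef = 1, number = Inf, sort.by = "none")
head(res)
```

Renaming the design columns to the bare level names is what allows the contrast to be written as "A-B"; without that step the columns would be called groupA and groupB.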
Examples
if (FALSE) { # \dontrun{
# Create example data:
data <- matrix(rnorm(1000), nrow = 100, ncol = 10)
rownames(data) <- paste0("gene", 1:100)
colnames(data) <- paste0("sample", 1:10)
metadata <- data.frame(sample = colnames(data), X = rep(c("A", "B"), each = 5))
# Example 1: Build design matrix from variables with a contrast:
de_res <- calculateDE(data, metadata, variables = "X", contrasts = "A-B")
# Example 2: Supply a custom design matrix directly:
design <- model.matrix(~0 + X, data = metadata)
de_res2 <- calculateDE(data, metadata, variables = "X", modelmat = design, contrasts = "A-B")
} # }