
This function computes differential gene expression statistics for each gene using a linear model fitted with the limma package. Users may supply a custom design matrix directly via the modelmat argument, or name variables from metadata from which the design matrix is built. When contrasts are supplied, they are applied using limma::makeContrasts and limma::contrasts.fit.

Usage

calculateDE(
  data,
  metadata = NULL,
  variables = NULL,
  modelmat = NULL,
  contrasts = NULL,
  ignore_NAs = FALSE
)

Arguments

data

A numeric matrix of gene expression values with genes as rows and samples as columns. Row names must correspond to gene identifiers. Data should not be transformed (i.e., not log2 transformed).

metadata

A data frame containing sample metadata used to build the design matrix (unless a design matrix is provided directly via modelmat).

variables

A character vector specifying the variable(s) from metadata to use in the default linear model. Ignored if modelmat is provided.

modelmat

(Optional) A user-supplied design matrix. If provided, this design is used directly and variables is ignored. The rows of the design matrix must be in the same sample order as the columns of data.
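The sample-order requirement can be checked before calling the function. The following is a minimal sketch, assuming the metadata has a sample column holding the column names of data (a hypothetical layout, borrowed from the Examples); it is not taken from the function's own code.

```r
# Toy inputs; the `sample` column is an assumption made for illustration.
set.seed(1)
data <- matrix(
  rnorm(40), nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:10))
)
metadata <- data.frame(
  sample = rev(colnames(data)),        # deliberately in reverse order
  X = rep(c("B", "A"), each = 5)
)

# Reorder the metadata rows so they follow the columns of `data`.
metadata <- metadata[match(colnames(data), metadata$sample), ]
stopifnot(identical(metadata$sample, colnames(data)))

# Build a no-intercept design: one indicator column per level of X.
design <- model.matrix(~ 0 + X, data = metadata)
rownames(design) <- metadata$sample
```

With the rows aligned this way, row i of design describes column i of data, which is what the linear model fit relies on.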

contrasts

A character vector specifying contrasts to be applied (e.g., c("A-B")). If multiple contrasts are provided, the function returns a list of DE results (one per contrast). If not provided, the fitted coefficients are returned, so with the default no-intercept design the result is the average expression profile of each condition rather than differential gene expression.

ignore_NAs

Logical (default: FALSE). Whether to ignore NAs in the metadata. If TRUE, rows with any NAs are removed before analysis, which reduces the number of samples fitted in the model. Only applicable if variables is provided.
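To illustrate what dropping samples with missing metadata amounts to, here is a rough sketch in base R. It assumes that only NAs in the chosen variables count and that the matching columns of data are dropped too; both are assumptions, and the function's internal handling may differ.

```r
# Toy metadata with one missing group label (illustrative values only).
metadata <- data.frame(
  sample = paste0("sample", 1:6),
  X = c("A", "A", NA, "B", "B", "B")
)
data <- matrix(
  1:12, nrow = 2,
  dimnames = list(c("gene1", "gene2"), metadata$sample)
)

variables <- "X"

# Keep only samples with complete values for the modelled variables.
keep <- complete.cases(metadata[, variables, drop = FALSE])
metadata_kept <- metadata[keep, , drop = FALSE]
data_kept <- data[, keep, drop = FALSE]
```

Here sample3 is removed from both objects, so five samples remain for the fit.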

Value

A list of data frames of differential expression statistics, one per contrast when contrasts are supplied.

Details

The function fits a linear model with limma::lmFit and applies empirical Bayes moderation with limma::eBayes. Depending on the input:

  • If a design matrix is provided via modelmat, that design is used directly.

  • Otherwise, a design matrix is constructed using the variables argument (with no intercept).

  • If contrasts are provided, they are applied using limma::makeContrasts and limma::contrasts.fit.

  • If no contrasts are provided, the function returns all possible coefficients fitted in the linear model.
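The steps listed above can be sketched with limma directly. This is an illustration under stated assumptions, not the function's source: the object names (expr, group, cm, tab) are hypothetical, the design columns are renamed to the bare level names so that the contrast string "A-B" resolves, and any preprocessing calculateDE applies to data is left out. The block is skipped when limma is not installed.

```r
if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(42)
  expr <- matrix(
    rnorm(1000), nrow = 100,
    dimnames = list(paste0("gene", 1:100), paste0("sample", 1:10))
  )
  group <- factor(rep(c("A", "B"), each = 5))

  # No-intercept design: one column per condition.
  design <- model.matrix(~ 0 + group)
  colnames(design) <- levels(group)   # "A" and "B", so "A-B" is a valid contrast

  # Fit the per-gene linear model.
  fit <- limma::lmFit(expr, design)

  # Apply the contrast, then moderate the statistics with empirical Bayes.
  cm <- limma::makeContrasts(contrasts = "A-B", levels = design)
  fit <- limma::contrasts.fit(fit, cm)
  fit <- limma::eBayes(fit)

  # One table of gene-level statistics for the contrast.
  tab <- limma::topTable(fit, coef = "A-B", number = Inf)
  head(tab)
}
```

Leaving out the makeContrasts and contrasts.fit steps would give one coefficient per design column instead, which corresponds to the no-contrast case in the list.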

Examples

if (FALSE) { # \dontrun{
  # Create example data:
  data <- matrix(rnorm(1000), nrow = 100, ncol = 10)
  rownames(data) <- paste0("gene", 1:100)
  colnames(data) <- paste0("sample", 1:10)
  metadata <- data.frame(sample = colnames(data), X = rep(c("A", "B"), each = 5))

  # Example 1: Build design matrix from variables with a contrast:
  de_res <- calculateDE(data, metadata, variables = "X", contrasts = "A-B")

  # Example 2: Supply a custom design matrix directly:
  design <- model.matrix(~0 + X, data = metadata)
  de_res2 <- calculateDE(data, metadata, variables = "X", modelmat = design, contrasts = "A-B")
} # }