
This function simulates false positive rates (FPR) by generating simulated gene signatures and comparing the observed effect size values (Cohen's d or f) of the original signatures to those from simulated signatures. The effect size is computed using three scoring methods (ssGSEA, logmedian, and ranking), and the results are visualized as violin plots with overlaid observed values.

Usage

FPR_Simulation(
  data,
  metadata,
  original_signatures,
  Variable,
  gene_list = NULL,
  number_of_sims = 10,
  title = NULL,
  widthTitle = 30,
  titlesize = 12,
  pointSize = 2,
  labsize = 10,
  mode = c("none", "simple", "medium", "extensive"),
  ColorValues = NULL,
  ncol = NULL,
  nrow = NULL
)

Arguments

data

A data frame or matrix of gene expression values (genes as rows, samples as columns).

metadata

A data frame containing metadata for the samples (columns of data).

original_signatures

A named list of gene signatures. Each element can be either:

  • A vector of gene names (unidirectional), or

  • A data frame with columns "Gene" and "Signal" for bidirectional signatures.
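The two accepted signature formats can be sketched as follows. The gene names and list names are hypothetical, and the `Signal` encoding (1 for up-regulated, -1 for down-regulated) is an assumption for illustration; the text above only states that a `Signal` column is required.

```r
# Unidirectional signature: a plain character vector of gene names
# (hypothetical names, for illustration only)
unidirectional <- c("GENE1", "GENE2", "GENE3")

# Bidirectional signature: a data frame with "Gene" and "Signal" columns.
# Assumption: 1 marks up-regulated genes and -1 marks down-regulated genes.
bidirectional <- data.frame(
  Gene   = c("GENE4", "GENE5", "GENE6"),
  Signal = c(1, 1, -1),
  stringsAsFactors = FALSE
)

# Named list combining both formats, as expected by original_signatures
my_signatures <- list(
  Sig_Unidirectional = unidirectional,
  Sig_Bidirectional  = bidirectional
)
```

The list names are what identify each signature, so they should be unique and descriptive.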

Variable

A column in metadata indicating the variable of interest for grouping or regression. This can be categorical or numeric.

gene_list

A character vector of gene names from which simulated signatures are generated by sampling. Default is all genes in data.

number_of_sims

Integer. Number of simulated gene signatures to generate per original signature (default: 10).

title

Optional title for the overall plot.

widthTitle

Integer. Max width for wrapping the title text (default: 30).

titlesize

Numeric. Font size for the title text (default: 12).

pointSize

Numeric. Size of the points representing simulations (default: 2).

labsize

Numeric. Font size for axis labels (default: 10).

ColorValues

Named vector of colors for plot points, typically Original and Simulated. If NULL, default colors are used.

ncol

Integer. Number of columns for arranging signature plots in a grid layout. If NULL, layout is auto-calculated.

nrow

Integer. Number of rows for arranging signature plots in a grid layout. If NULL, layout is auto-calculated.

mode

A string specifying the level of detail for contrasts. Options are:

  • "simple": Performs the minimal number of pairwise comparisons between individual group levels (e.g., A - B, A - C).

  • "medium": Includes comparisons between one group and the union of all other groups (e.g., A - (B + C + D)), enabling broader contrasts beyond simple pairs.

  • "extensive": Allows for all possible algebraic combinations of group levels (e.g., (A + B) - (C + D)), supporting flexible and complex contrast definitions.

  • "none": Compares all levels of Variable without defining specific contrasts. Default.
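To make the contrast levels more concrete, here is a minimal base R sketch that enumerates "simple"-style and "medium"-style contrast labels for four hypothetical group levels. It assumes that "simple" covers every pair of individual levels; it is not the package's internal contrast builder, and the label format is only illustrative.

```r
# Hypothetical group levels of the variable of interest
levels_of_variable <- c("A", "B", "C", "D")

# "simple"-style contrasts: every pair of individual levels
level_pairs <- combn(levels_of_variable, 2)
simple_contrasts <- paste(level_pairs[1, ], level_pairs[2, ], sep = " - ")

# "medium"-style contrasts: each level against the union of all others
medium_contrasts <- vapply(levels_of_variable, function(lv) {
  others <- setdiff(levels_of_variable, lv)
  paste0(lv, " - (", paste(others, collapse = " + "), ")")
}, character(1))

print(simple_contrasts)
print(unname(medium_contrasts))
```

The number of contrasts grows quickly with the number of levels, so the more detailed modes produce more panels and more tests per signature.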

Value

Invisibly returns the combined ggplot object showing observed vs simulated effect sizes. One violin plot is generated per signature and contrast. Observed values are highlighted and compared to the simulated distribution. Significance (adjusted p-value ≤ 0.05) is indicated by point shape.

Details

The function supports both categorical and numeric variables:

  • For categorical variables, Cohen's d is used; contrasts are defined by the mode parameter unless mode is "none".

  • For numeric variables, Cohen's f is used to quantify associations through linear modeling.
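The two effect sizes can be illustrated with textbook definitions in base R. This is a sketch under stated assumptions, not the code inside CohenD_allConditions() or CohenF_allConditions(): Cohen's d is computed with the pooled standard deviation, and Cohen's f is derived from the R² of a simple linear model as f = sqrt(R² / (1 - R²)). The helper names `cohen_d` and `cohen_f` are hypothetical.

```r
# Cohen's d between two groups, using the pooled standard deviation
cohen_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Cohen's f for a numeric predictor, from the R^2 of a linear model
cohen_f <- function(score, variable) {
  r2 <- summary(lm(score ~ variable))$r.squared
  sqrt(r2 / (1 - r2))
}

# Toy values, for illustration only
d_example <- cohen_d(c(1, 2, 3), c(2, 3, 4))
f_example <- cohen_f(c(1, 2, 3, 5), c(1, 2, 3, 4))
```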

For each original gene signature, a number of simulated signatures are created by sampling genes from gene_list. Each simulated signature is scored using three methods, and its effect size is computed relative to the variable of interest. The resulting distributions are shown as violins, overlaid with the observed value from the original signature. A red dashed line marks the 95th percentile of the simulated distribution per method.
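The simulation idea described above can be sketched on toy data in base R. Everything here is an assumption for illustration: the expression matrix is random, the score is a plain per-sample mean (a stand-in, not ssGSEA, logmedian, or ranking), the effect size is the absolute Cohen's d between two groups, and the names `score_signature` and `abs_d` are hypothetical.

```r
set.seed(1)  # reproducible toy data

# Toy expression matrix: 200 hypothetical genes x 20 samples
expr <- matrix(rnorm(200 * 20), nrow = 200,
               dimnames = list(paste0("GENE", 1:200), paste0("S", 1:20)))
group <- rep(c("A", "B"), each = 10)

gene_pool      <- rownames(expr)  # plays the role of gene_list
signature_size <- 15              # size of the original signature
n_sims         <- 100             # plays the role of number_of_sims

# Stand-in score: mean expression of the signature genes per sample
score_signature <- function(genes) colMeans(expr[genes, , drop = FALSE])

# Stand-in effect size: absolute Cohen's d between groups A and B
abs_d <- function(score) {
  a <- score[group == "A"]
  b <- score[group == "B"]
  pooled_sd <- sqrt(((length(a) - 1) * var(a) + (length(b) - 1) * var(b)) /
                      (length(a) + length(b) - 2))
  abs(mean(a) - mean(b)) / pooled_sd
}

# One effect size per simulated (randomly sampled) signature
simulated_d <- replicate(n_sims, {
  abs_d(score_signature(sample(gene_pool, signature_size)))
})

# 95th percentile of the simulated distribution
threshold_95 <- quantile(simulated_d, 0.95)
```

An observed effect size above `threshold_95` would be larger than what most random gene sets of the same size achieve in this toy setting.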

The function internally uses CohenD_allConditions() and CohenF_allConditions() depending on variable type.

Examples

if (FALSE) { # \dontrun{
FPR_Simulation(
  data = expression_data,
  metadata = sample_metadata,
  original_signatures = my_signatures,
  Variable = "condition",
  number_of_sims = 100,
  title = "Simulation for FPR"
)
} # }