This function performs principal component analysis (PCA) on a given dataset and visualizes the results using ggplot2. Users can restrict the analysis to genes of interest, control scaling and centering, and color points by a metadata variable.
Arguments
- data
A numeric matrix or data frame where rows represent genes and columns represent samples.
- metadata
A data frame containing sample metadata. The first column should contain sample names. Default is NULL.
- genes
A character vector specifying genes to be included in the PCA. Default is NULL (uses all genes).
- scale
Logical; if TRUE, variables are scaled before PCA. Default is FALSE.
- center
Logical; if TRUE, variables are centered before PCA. Default is TRUE.
- PCs
A list specifying which principal components (PCs) to plot. Default is list(c(1,2)).
- ColorVariable
A character string specifying the metadata column used for coloring points. Default is NULL.
- ColorValues
A vector specifying custom colors for groups in ColorVariable. Default is NULL.
- pointSize
Numeric; sets the size of points in the plot. Default is 5.
- legend_nrow
Integer; number of rows in the legend. Default is 2.
- legend_position
Character; position of the legend ("bottom", "top", "right", "left"). Default is "bottom".
- ncol
Integer; number of columns in the arranged PCA plots. Default is determined automatically.
- nrow
Integer; number of rows in the arranged PCA plots. Default is determined automatically.
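As a rough illustration of what scale and center control, the sketch below calls prcomp() (the function named under Details) directly on a simulated matrix. It assumes that plotPCA passes these two arguments through to prcomp()'s center and scale. arguments and transposes data so that samples become rows; neither detail is confirmed by this page.

```r
# Simulated genes x samples matrix, shaped like the `data` argument.
set.seed(123)
data <- matrix(rnorm(1000), nrow = 50, ncol = 20,
               dimnames = list(paste0("Gene", 1:50), paste0("Sample", 1:20)))

# prcomp() expects observations in rows, so transpose to samples x genes.
# ASSUMPTION: plotPCA does something equivalent internally.
pca_centered <- prcomp(t(data), center = TRUE, scale. = FALSE)  # documented defaults
pca_scaled   <- prcomp(t(data), center = TRUE, scale. = TRUE)   # unit-variance genes

# Proportion of variance explained by each PC (useful for axis labels).
var_explained <- pca_centered$sdev^2 / sum(pca_centered$sdev^2)

# One row of PC scores per sample.
n_samples_scored <- nrow(pca_centered$x)
```

With scale. = TRUE every gene contributes equally to the total variance, which can matter when a few highly variable genes would otherwise dominate the first components.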
Value
An invisible list containing:
- plt
A ggplot2 or ggarrange object displaying the PCA plot.
- data
A data frame containing PCA-transformed data and sample metadata (if not NULL).
Details
The function performs PCA using prcomp() and visualizes the results using ggplot2.
If a metadata data frame is provided, the function ensures that the sample order in metadata matches the sample order in data.
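The steps described above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the function's actual source: it assumes samples are matched by comparing colnames(data) with the first metadata column, that a genes subset is applied by row name, and that the plot is a plain ggplot2 scatter plot. The variable names idx, scores and plot_data are illustrative only.

```r
set.seed(123)
data <- matrix(rnorm(1000), nrow = 50, ncol = 20,
               dimnames = list(paste0("Gene", 1:50), paste0("Sample", 1:20)))
metadata <- data.frame(Sample = colnames(data),
                       Group  = rep(c("A", "B"), each = 10))

# Shuffle metadata to mimic a mismatched sample order.
metadata <- metadata[sample(nrow(metadata)), ]

# ASSUMPTION: align metadata rows to the column order of `data`
# using the first metadata column (documented as the sample names).
idx <- match(colnames(data), metadata[[1]])
stopifnot(!anyNA(idx))                      # every sample needs a metadata row
metadata <- metadata[idx, , drop = FALSE]
rownames(metadata) <- NULL

# Optional gene subset (the `genes` argument), then PCA on samples x genes.
genes <- paste0("Gene", 1:10)
pca <- prcomp(t(data[genes, , drop = FALSE]), center = TRUE, scale. = FALSE)

# Combine PC scores with the remaining metadata columns.
scores <- data.frame(Sample = rownames(pca$x), pca$x[, 1:2], row.names = NULL)
plot_data <- cbind(scores, metadata[, -1, drop = FALSE])

# Plot only if ggplot2 is installed; the mapping mirrors ColorVariable = "Group".
if (requireNamespace("ggplot2", quietly = TRUE)) {
  plt <- ggplot2::ggplot(plot_data,
                         ggplot2::aes(x = PC1, y = PC2, color = Group)) +
    ggplot2::geom_point(size = 5) +
    ggplot2::theme(legend.position = "bottom")
}
```

Matching by name instead of relying on row position means a reordered metadata table still colors each sample by its own group.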
Examples
if (FALSE) { # \dontrun{
# Example dataset
set.seed(123)
data <- matrix(rnorm(1000), nrow=50, ncol=20)
colnames(data) <- paste0("Sample", 1:20)
rownames(data) <- paste0("Gene", 1:50)
metadata <- data.frame(Sample = colnames(data),
                       Group = rep(c("A", "B"), each = 10))
# Basic PCA plot
plotPCA(data, metadata, ColorVariable = "Group")
} # }
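A further usage sketch, built only from the arguments and return value documented on this page, shows how several PC pairs might be requested and how the invisible return value can be captured. It assumes the package providing plotPCA is attached, and it is guarded so that it does nothing otherwise; if another attached package exports a different plotPCA, the call would not apply. The names res and genes_of_interest are illustrative.

```r
set.seed(123)
data <- matrix(rnorm(1000), nrow = 50, ncol = 20,
               dimnames = list(paste0("Gene", 1:50), paste0("Sample", 1:20)))
metadata <- data.frame(Sample = colnames(data),
                       Group  = rep(c("A", "B"), each = 10))

# Restrict the PCA to a subset of genes (the `genes` argument).
genes_of_interest <- paste0("Gene", 1:25)

# Run only when a function called plotPCA is available in the session.
if (exists("plotPCA", mode = "function")) {
  res <- plotPCA(data, metadata,
                 genes = genes_of_interest,
                 PCs = list(c(1, 2), c(2, 3)),   # two panels: PC1 vs PC2, PC2 vs PC3
                 ColorVariable = "Group",
                 pointSize = 3,
                 legend_position = "right")

  # The result is returned invisibly, so assign it to inspect its parts.
  res$plt          # ggplot2 or ggarrange object
  head(res$data)   # PCA-transformed data plus sample metadata
}
```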