---
title: "Theory behind HTRfit"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Theory behind HTRfit}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "../man/figures/"
)

In RNA-seq analysis, experimental parameters such as sequencing depth and the number of replicates influence the statistical power to detect expression changes. To guide the choice of values for these parameters, we introduce HTRfit, a statistical framework underpinned by computational simulation. HTRfit is also compatible with DESeq2 outputs, which facilitates a comprehensive evaluation of RNA-seq analyses.

## HTRfit simulation workflow

In this modeling framework, counts $K_{ij}$ for gene $i$ and sample $j$ are generated from a negative binomial distribution parameterized by a fitted mean $\mu_{ij}$ and a gene-specific dispersion parameter $dispersion_i$. The fitted mean $\mu_{ij}$ is determined by a parameter $q_{ij}$, which is proportionally related to the sum of all effects specified with `init_variable()` or `add_interaction()`. If basal gene expressions are provided, the $q_{ij}$ values are scaled by the gene-specific basal expression value ($bexpr_i$). The coefficients $\beta_i$ are the natural-logarithm fold changes for gene $i$ for each column of the model matrix $X$. The dispersion parameter $dispersion_i$ defines the relationship between the variance of the observed counts and their mean; in simpler terms, it quantifies how far observed counts are expected to deviate from the mean for each gene. In addition, HTRfit controls sequencing depth through a sample-specific scalar ($s_j$) applied to $\mu_{ij}$.
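
The generative model described above can be sketched in base R. This is an illustration only, not HTRfit's implementation. It assumes a log link ($\log q_{ij} = bexpr_i + x_j \beta_i$), that $s_j$ multiplies $q_{ij}$ to give $\mu_{ij}$, and that $dispersion_i$ plays the role of the `size` argument of `rnbinom()`, so that the variance is $\mu + \mu^2 / dispersion$. HTRfit's actual conventions may differ on each of these points. The design, all numeric values, and all variable names are invented for the example.

```r
# Hypothetical base-R sketch of the count model; not HTRfit's own code.
set.seed(42)

n_genes   <- 3
n_samples <- 6

# Model matrix X: intercept plus one two-level condition (assumed design).
condition <- factor(rep(c("ctrl", "treated"), each = 3))
X <- model.matrix(~ condition)                 # n_samples x 2

# beta[i, ]: natural-log effects for gene i, one per column of X (made-up values).
beta <- rbind(c(0,  0.0),
              c(0,  1.0),
              c(0, -0.5))

# Gene-specific basal expression on the log scale (made-up values).
bexpr <- c(4, 5, 6)

# ASSUMPTION: log link, log(q_ij) = bexpr_i + x_j . beta_i.
# bexpr has one entry per row, so it is recycled down each column (per gene).
log_q <- beta %*% t(X) + bexpr                 # n_genes x n_samples
q <- exp(log_q)

# ASSUMPTION: sequencing depth enters as mu_ij = s_j * q_ij.
s  <- c(1, 1, 1, 2, 2, 2)
mu <- sweep(q, 2, s, `*`)

# ASSUMPTION: dispersion_i is used as rnbinom()'s `size`,
# giving Var(K_ij) = mu_ij + mu_ij^2 / dispersion_i.
dispersion <- c(10, 10, 10)

# Draw counts gene by gene within each sample (column-major order).
K <- matrix(
  rnbinom(n_genes * n_samples,
          mu   = as.vector(mu),
          size = rep(dispersion, times = n_samples)),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)
```

Under these assumptions, doubling $s_j$ doubles the expected count of every gene in sample $j$, while a larger `dispersion` value brings the variance closer to the mean.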