
# High-Throughput RNA-seq model fit

## Why use HTRfit

HTRfit provides a statistical framework, based on computational simulation, for investigating the experimental parameters that influence your ability to detect expression changes, such as sequencing depth or the number of replicates.

Furthermore, because it supports fixed effects, mixed effects, and interactions in RNAseq data analysis, HTRfit gives you the flexibility needed to carry out your differential expression analysis.


- [Installation](#installation)
- [CRAN packages dependencies](#cran-packages-dependencies)
- [Docker](#docker)
- [Biosphere virtual machine](#biosphere-virtual-machine)
- [HTRfit simulation workflow](#htrfit-simulation-workflow)
- [Getting started](#getting-started)


## Installation

#### Method A

To install the latest version of HTRfit, run the following in your R console:
```
if (!requireNamespace("remotes", quietly = TRUE))
    install.packages("remotes")
remotes::install_git("https://gitbio.ens-lyon.fr/aduvermy/HTRfit")
```

#### Method B

Alternatively, download a release from the [HTRfit release page](https://gitbio.ens-lyon.fr/aduvermy/HTRfit/-/releases) and untar the archive. Then open your R console and run the following command, replacing `/HTRfit-v1.0.0` with the path to the untarred folder:

```
## -- Example using the HTRfit-v1.0.0 release
install.packages('/HTRfit-v1.0.0', repos = NULL, type='source')

```

When dependencies are met, installation should take a few minutes.
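
To confirm that the installation succeeded, a quick check like the one below can help. It is a minimal sketch using only base R; `is_installed()` is a hypothetical helper written for this example and is not part of HTRfit.

```r
## -- hypothetical helper (not part of HTRfit): TRUE if the package namespace can be loaded
is_installed <- function(pkg) {
  isTRUE(requireNamespace(pkg, quietly = TRUE))
}

if (is_installed("HTRfit")) {
  message("HTRfit ", as.character(utils::packageVersion("HTRfit")), " is installed")
} else {
  message("HTRfit was not found; try installing it again")
}
```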


## CRAN packages dependencies

The following dependencies are required:

```
## -- required (CRAN)
## -- 'parallel', 'stats' and 'utils' ship with base R and do not need to be installed
install.packages(c('car', 'data.table', 'ggplot2', 'gridExtra', 'glmmTMB',
 'magrittr', 'MASS', 'plotROC', 'reshape2', 'rlang', 'BiocManager'))
## -- required (Bioconductor)
BiocManager::install('S4Vectors', update = FALSE)
## -- optional
BiocManager::install('DESeq2', update = FALSE)
```
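
If some of these packages are already present, it can be convenient to install only the missing ones. The snippet below is a minimal sketch in base R; `missing_packages()` is a hypothetical helper introduced for this example, and the package vector simply repeats the required CRAN packages of this section (packages shipped with base R are left out because they are always present).

```r
## -- required CRAN packages from this section
cran_deps <- c('car', 'data.table', 'ggplot2', 'gridExtra', 'glmmTMB',
               'magrittr', 'MASS', 'plotROC', 'reshape2', 'rlang', 'BiocManager')

## -- hypothetical helper (not part of HTRfit): packages from `pkgs` that are not installed
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(utils::installed.packages()))
}

## -- list what is still missing, without installing anything
to_install <- missing_packages(cran_deps)
print(to_install)
```

Running `install.packages(to_install)` afterwards installs only what is missing.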

## Docker

We provide [Docker images](https://hub.docker.com/repository/docker/ruanad/htrfit/general) to make the package easier to use. For development and coding with the Docker container, we recommend Visual Studio Code (VSCode) with the Dev Containers extension. This setup provides a convenient and isolated environment for development and testing.

1. Install VSCode.
2. Install Docker on your system and the Docker extension in VSCode.
3. Launch the HTRfit container directly from VSCode.
4. Install the Dev Containers extension for VSCode.
5. Launch a remote window connected to the running Docker container.
6. Install the R extension for VSCode.
7. Enjoy HTRfit!


## Biosphere virtual machine

A straightforward way to use **HTRfit** is to run it on a Virtual Machine (VM) through [Biosphere](https://biosphere.france-bioinformatique.fr/catalogue/). We recommend a VM that includes RStudio, which provides an integrated development environment (IDE). Biosphere VM resources can also be scaled according to your simulation needs.  
**HTRfit** can be installed on the VM using [method A](#method-a).


## HTRfit simulation workflow

In RNAseq analysis, experimental parameters such as sequencing depth and the number of replicates strongly affect the statistical power to detect expression changes. **HTRfit** is a statistical framework, based on computational simulation, that helps you choose suitable values for these parameters. **HTRfit** is also compatible with DESeq2 outputs, which broadens the evaluation of your RNAseq analysis.


<div id="bg"  align="center">
  <img src="./vignettes/figs/htrfit_workflow.png" width="500" height="300">
</div> 


## Getting started

```
library('HTRfit')
library('magrittr')  # makes the %>% pipe available
## -- init a design: varA (60 levels), varB (2 levels) and their interaction
input_var_list <- init_variable( name = "varA", mu = 0, sd = 0.29, level = 60) %>%
                  init_variable( name = "varB", mu = 0.27, sd = 0.6, level = 2) %>%
                    add_interaction( between_var = c("varA", "varB"), mu = 0.44, sd = 0.89)
## -- simulate RNAseq data (30 genes, exactly 10 replicates)
mock_data <- mock_rnaseq(input_var_list, 
                         n_genes = 30,
                         min_replicates  = 10,
                         max_replicates = 10, 
                         basal_expression = 5 )
## -- prepare data & fit a model with mixed effect
data2fit <- prepareData2fit(countMatrix = mock_data$counts, 
                            metadata =  mock_data$metadata, 
                            normalization = FALSE)
## -- one negative binomial model per gene: fixed effect varB, random effect of varB within varA
l_tmb <- fitModelParallel(formula = kij ~ varB + (varB | varA),
                          data = data2fit, 
                          group_by = "geneID",
                          family = glmmTMB::nbinom2(link = "log"), 
                          log_file = "log.txt",
                          n.cores = 1)
## -- evaluation
resSimu <- simulationReport(mock_data, 
                            list_tmb = l_tmb,
                            coeff_threshold = 0.27, 
                            alt_hypothesis = "greater")
```
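
To get a feel for the size of this simulated experiment, the design can be enumerated in base R. This is only a sketch: it assumes that one sample is generated for every combination of `varA` and `varB` levels and for every replicate, which is an assumption about how the design is expanded rather than something stated in this README. The variable names below are introduced for the example.

```r
## -- design values copied from the example above
n_levels_varA <- 60
n_levels_varB <- 2
n_replicates  <- 10   # min_replicates == max_replicates
n_genes       <- 30

## -- ASSUMPTION: one sample per (varA, varB) combination and per replicate
design <- expand.grid(varA = seq_len(n_levels_varA),
                      varB = seq_len(n_levels_varB),
                      replicate = seq_len(n_replicates))

n_samples      <- nrow(design)          # 60 * 2 * 10 = 1200
n_observations <- n_samples * n_genes   # 1200 * 30 = 36000
cat("samples:", n_samples, "/ gene-level observations:", n_observations, "\n")
```

Comparing `n_samples` with `dim(mock_data$counts)` is a quick way to check whether this assumption holds for your own design.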