# Evolutionary history of SARS-CoV-2 interactome in bats and primates identifies key virus-host interfaces and conflicts

## Introduction

The current COVID-19 pandemic is caused by a novel coronavirus strain, SARS-CoV-2. It originated from the cross-species transmission of a coronavirus from the bat reservoir to humans, either directly or through an intermediate host. This catastrophic spillover underlines the need to better understand how viruses and hosts have shaped one another over evolutionary time.

Pathogenic viruses exert selective pressure on the host proteins that interact with viral proteins. Identifying which host genes bear signatures of such evolutionary conflict (e.g. positive selection) can reveal the proteins that have been most relevant in the response to a virus family. Here, we have used this evolutionary framework to decipher which interactions between SARS-CoV-2-like viruses and our cells have been important in vivo. In addition, identifying traces of positive selection in different host phylogenetic lineages sheds light on ancient epidemics and on how virus-host determinants may be species-specific. This may help to understand differences between hosts in susceptibility to SARS-CoV-like viruses and in their pathogenicity.

To achieve this, we characterized the evolutionary history of the SARS-CoV-2 interactome identified in in vitro studies: 332 host proteins identified by mass spectrometry by Gordon and collaborators [1], as well as two essential SARS-CoV-2 entry factors, angiotensin-converting enzyme 2 (ACE2) and transmembrane serine protease 2 (TMPRSS2). We characterized their evolution in primates (tracing human history) and in bats (the natural viral reservoir). To do so, we used [DGINN](https://academic.oup.com/nar/article/48/18/e103/5907962?login=true), a novel computational pipeline to Detect Genetic INNovations in protein-coding genes, which embeds gold-standard methods to perform phylogenetic and positive selection analyses in a high-throughput manner.

## Data formatting

Requisite R packages: formatR, tinytex 

~

Script to merge DGINN outputs from different batches of analysis and to include or correct the rows corresponding to genes rerun on corrected alignments.
```
rnw_scripts/covid_comp_script0_table.pdf
```
Input tables in **data/**.

Output tables in **out_tab/**

The tables produced by this script are used in the following analysis steps.
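The merge logic described in this section can be illustrated with a minimal sketch. It is written in plain Python for illustration only and is not the code of the R/Sweave script in this repository. The function name `merge_batches` and the row fields `gene` and `status` are hypothetical, and the assumption is that each table has one row per gene.

```python
def merge_batches(batches, corrected):
    """Merge per-batch result rows, then apply rows from corrected reruns.

    batches:   list of batches, each a list of dict rows with a "gene" key
               (hypothetical field name).
    corrected: list of dict rows for genes rerun on corrected alignments.
    """
    merged = {}
    for batch in batches:
        for row in batch:
            merged[row["gene"]] = dict(row)
    for row in corrected:
        # A corrected row replaces the existing row for that gene,
        # or is added if the gene was absent from every batch.
        merged[row["gene"]] = dict(row)
    # Return rows in a stable order, sorted by gene name.
    return [merged[gene] for gene in sorted(merged)]


# Toy data; gene names other than ACE2 and TMPRSS2 are placeholders.
batch1 = [
    {"gene": "ACE2", "status": "ok"},
    {"gene": "TMPRSS2", "status": "misaligned"},
]
batch2 = [{"gene": "GENE_A", "status": "ok"}]
corrected = [
    {"gene": "TMPRSS2", "status": "ok"},
    {"gene": "GENE_B", "status": "ok"},
]

table = merge_batches([batch1, batch2], corrected)
```

Keying rows by gene makes the correction step a simple overwrite, so a gene rerun on a corrected alignment never appears twice in the merged table.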

## Comparison between primate and bat datasets

Requisite R packages: Mondrian, UpSetR, dendextend, ggraph, igraph, tidyverse, viridis.

~

Script to compare the bat and primate screens.
```
rnw_scripts/covid_comp_dataset.pdf
```
Input tables in **out_tab/**.

Output tables in **figure/1_xxx**

## Comparison with MAIC score and pancorona analysis

Script to compare the DGINN screen results with the [MAIC](https://www.nature.com/articles/s41598-020-79033-3) score and the [pancorona data](https://translational-medicine.biomedcentral.com/articles/10.1186/s12967-020-02480-z).

```
rnw_scripts/covid_comp_maic_pancorona.pdf
```
Input tables in **out_tab/**.

Output tables in **figure/2_xxx**