# Evolutionary history of SARS-CoV-2 interactome in bats and primates identifies key virus-host interfaces and conflicts
## Introduction
The current COVID-19 pandemic is caused by a novel coronavirus strain, SARS-CoV-2. It originated from the cross-species transmission of a coronavirus from the bat reservoir to humans, either directly or through an intermediate host. This catastrophic spillover underlines the need to better understand how viruses and hosts have shaped one another over evolutionary time.
Pathogenic viruses exert selective pressure on the host proteins they interact with. Identifying which host genes bear signatures of such evolutionary conflict (e.g. positive selection) can reveal the proteins that have been most relevant in the response to a virus family. Here, we used this evolutionary framework to decipher which interactions between SARS-CoV-2-like viruses and our cells have been important in vivo. In addition, identifying traces of positive selection in different host phylogenetic lineages sheds light on ancient epidemics and on how virus-host determinants may be species-specific. This may help explain differences between hosts in susceptibility to SARS-CoV-like viruses and in their pathogenicity.
To achieve this, we characterized the evolutionary history of the SARS-CoV-2 interactome as defined by in vitro studies: 332 host proteins identified by mass spectrometry by Gordon and collaborators [1], together with two essential SARS-CoV-2 entry factors, angiotensin-converting enzyme 2 (ACE2) and transmembrane serine protease 2 (TMPRSS2). We characterized the evolution of the corresponding genes in primates (tracing the human history) and in bats (the natural viral reservoir). To do so, we used [DGINN](https://academic.oup.com/nar/article/48/18/e103/5907962?login=true), a novel computational pipeline to Detect Genetic INNovations in protein-coding genes, which embeds gold-standard methods to perform phylogenetic and positive selection analyses in a high-throughput manner.
Script to merge DGINN outputs from different batches of analysis and to add or correct the rows corresponding to genes that were rerun on corrected alignments.
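As a rough illustration of this merge logic only (not the repository's own script, which is written in R/Rnw), the following Python sketch combines several batch tables and then lets rows from a corrected-alignment table replace or extend them. The column names `Gene` and `NbMethods`, the gene label `GENE_X`, and all values are hypothetical toy data, not real DGINN output.

```python
import csv
import io


def read_table(text, key="Gene"):
    """Parse a tab-separated table into {gene: row_dict}.

    The key column name "Gene" is an assumption for this sketch.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row[key]: row for row in reader}


def merge_batches(batches, corrected, key="Gene"):
    """Merge batch tables, then apply the corrected rows on top.

    A gene present in `corrected` replaces its earlier row;
    a gene absent from every batch is simply added.
    """
    merged = {}
    for text in batches:
        merged.update(read_table(text, key))
    merged.update(read_table(corrected, key))
    return merged


# Toy inputs (made-up values, hypothetical columns).
batch1 = "Gene\tNbMethods\nACE2\t1\nTMPRSS2\t2\n"
batch2 = "Gene\tNbMethods\nGENE_X\t0\n"
corrected = "Gene\tNbMethods\nTMPRSS2\t4\n"

merged = merge_batches([batch1, batch2], corrected)
for gene, row in merged.items():
    print(gene, row["NbMethods"])
```

Keying the rows by gene name makes the override rule explicit: the last table read wins, so the corrected table is applied last.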
The tables produced by this script are used in the following analysis steps.
## Comparison between the primate and bat datasets
Required R packages: Mondrian, UpSetR, dendextend, ggraph, igraph, tidyverse, viridis.
Script to compare the bat and primate screens.
```
rnw_scripts/covid_comp_dataset.pdf
```
Input tables in **out_tab/**.
## Comparison with MAIC score and pancorona analysis
Script to compare the DGINN screen results to the [MAIC](https://www.nature.com/articles/s41598-020-79033-3) score and to the [pancorona data](https://translational-medicine.biomedcentral.com/articles/10.1186/s12967-020-02480-z).
```
rnw_scripts/covid_comp_maic_pancorona.pdf
```
Input tables in **out_tab/**.