Commit 4d71656b authored by mcariou

write assembly

parent ba09c3de
......@@ -255,6 +255,7 @@ Origin<-rep("pneumo", 6)
tabAss<-rbind(tabAss, cbind(Taxonomy, Assembly, Origin))
write.table(tabAss, paste0(home, "/phylolegio/doc/tabAss.txt"), quote=FALSE, row.names=FALSE)
@
......
This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian) (preloaded format=pdflatex 2021.6.16) 11 OCT 2021 11:40
This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian) (preloaded format=pdflatex 2021.6.16) 12 OCT 2021 15:14
entering extended mode
restricted \write18 enabled.
%&-line parsing enabled.
......@@ -1900,12 +1900,19 @@ enome : FQ958210.1
Package xcolor Warning: Incompatible color definition on input line 527.
Overfull \hbox (174.62796pt too wide) in paragraph at lines 545--545
[][]\OT1/cmtt/m/n/10.95 write.table[][](tabAss,[] []paste0[][](home,[] []"/phyl
olegio/doc/tabAss.txt"[][]),[] []quote[][]=[][]FALSE[][],[] []row.names[][]=[][
]FALSE[][])[][]
[]
[10]
Package xcolor Warning: Incompatible color definition on input line 554.
Package xcolor Warning: Incompatible color definition on input line 556.
Overfull \hbox (249.08722pt too wide) in paragraph at lines 556--556
Overfull \hbox (249.08722pt too wide) in paragraph at lines 558--558
[][]\OT1/cmtt/m/n/10.95 tab10[][]<-[][]read.table[][]([][]paste0[][](data, data
list1[[][]3[][]]),[] []skip[][]=[][]2[][],[] []sep[][]=[][]"\OMS/cmsy/m/n/10.95
n\OT1/cmtt/m/n/10.95 t"[][],[] []fill[][]=[][]TRUE[][],[] []header[][]=[][]TRU
......@@ -1913,52 +1920,53 @@ E[][],[] []comment.char[] []=[] []""[][])[][]
[]
Overfull \hbox (94.14633pt too wide) in paragraph at lines 559--559
Overfull \hbox (94.14633pt too wide) in paragraph at lines 561--561
[][]\OT1/cmtt/m/n/10.95 tab10[][]$[][]sp[][]<-[][]sapply[][]([][]as.character[]
[](tab10[][]$[][]ORF),[] []function[][]([][]x[][])[] []strsplit[][](x,[] []"_"[
][])[[[][]1[][]]][[][]1[][]])[][]
[]
Package xcolor Warning: Incompatible color definition on input line 571.
Package xcolor Warning: Incompatible color definition on input line 573.
Overfull \hbox (2.16733pt too wide) in paragraph at lines 573--573
Overfull \hbox (2.16733pt too wide) in paragraph at lines 575--575
[][]\OT1/cmtt/m/n/10.95 list.files[][]([][]path[] []=[] []paste0[][](home,[] []
"/genes/78genes/prot_pneumo78"[][]))[][]
[]
Overfull \hbox (186.12534pt too wide) in paragraph at lines 579--579
Overfull \hbox (186.12534pt too wide) in paragraph at lines 581--581
[]\OT1/cmtt/m/n/10.95 ## [2] "uniprot-lpg2337+OR+lpg2304+OR+lpg2302+OR+lpg1911+
OR+lpg2330+OR+lpg1767+OR+lpg231--.tab"[]
[]
Overfull \hbox (203.3714pt too wide) in paragraph at lines 579--579
Overfull \hbox (203.3714pt too wide) in paragraph at lines 581--581
[]\OT1/cmtt/m/n/10.95 ## [3] "uniprot-lpg2337+OR+lpg2304+OR+lpg2302+OR+lpg1911+
OR+lpg2330+OR+lpg1767+OR+lpg231--.tab.gz"[]
[]
Package atveryend Info: Empty hook `BeforeClearDocument' on input line 597.
[11]
Package atveryend Info: Empty hook `AfterLastShipout' on input line 597.
Package atveryend Info: Empty hook `BeforeClearDocument' on input line 613.
[12]
Package atveryend Info: Empty hook `AfterLastShipout' on input line 613.
(./1_reference_legio_phylo.aux)
Package atveryend Info: Executing hook `AtVeryEndDocument' on input line 597.
Package atveryend Info: Executing hook `AtEndAfterFileList' on input line 597.
Package atveryend Info: Executing hook `AtVeryEndDocument' on input line 613.
Package atveryend Info: Executing hook `AtEndAfterFileList' on input line 613.
Package rerunfilecheck Info: File `1_reference_legio_phylo.out' has not changed
.
(rerunfilecheck) Checksum: 2CD0A8DA6DC76FA5F58220EAE85F2EFB;998.
Package atveryend Info: Empty hook `AtVeryVeryEnd' on input line 597.
Package atveryend Info: Empty hook `AtVeryVeryEnd' on input line 613.
)
Here is how much of TeX's memory you used:
7819 strings out of 492982
111902 string characters out of 6134896
7820 strings out of 492982
111909 string characters out of 6134896
215483 words of memory out of 5000000
11275 multiletter control sequences out of 15000+600000
9834 words of font info for 35 fonts, out of 8000000 for 9000
1141 hyphenation exceptions out of 8191
28i,6n,35p,438b,373s stack positions out of 5000i,500n,10000p,200000b,80000s
28i,6n,35p,442b,378s stack positions out of 5000i,500n,10000p,200000b,80000s
{/usr/share/texmf/fonts/enc/dvips/cm-super/cm-super-ts1.enc}</usr/share/texli
ve/texmf-dist/fonts/type1/public/amsfonts/cm/cmbx10.pfb></usr/share/texlive/tex
mf-dist/fonts/type1/public/amsfonts/cm/cmbx12.pfb></usr/share/texlive/texmf-dis
......@@ -1969,10 +1977,10 @@ c/amsfonts/cm/cmr17.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfo
nts/cm/cmsy10.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm
/cmti10.pfb></usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmtt1
0.pfb></usr/share/texmf/fonts/type1/public/cm-super/sfrm1095.pfb>
Output written on 1_reference_legio_phylo.pdf (11 pages, 148543 bytes).
Output written on 1_reference_legio_phylo.pdf (12 pages, 153937 bytes).
PDF statistics:
178 PDF objects out of 1000 (max. 8388607)
153 compressed objects within 2 object streams
27 named destinations out of 1000 (max. 500000)
182 PDF objects out of 1000 (max. 8388607)
156 compressed objects within 2 object streams
28 named destinations out of 1000 (max. 500000)
113 words of extra memory for PDF output out of 10000 (max. 10000000)
......@@ -541,6 +541,8 @@ Manual research of reference genomes using gupta1
\hlstd{Origin}\hlkwb{<-}\hlkwd{rep}\hlstd{(}\hlstr{"pneumo"}\hlstd{,} \hlnum{6}\hlstd{)}
\hlstd{tabAss}\hlkwb{<-}\hlkwd{rbind}\hlstd{(tabAss,} \hlkwd{cbind}\hlstd{(Taxonomy, Assembly, Origin))}
\hlkwd{write.table}\hlstd{(tabAss,} \hlkwd{paste0}\hlstd{(home,} \hlstr{"/phylolegio/doc/tabAss.txt"}\hlstd{),} \hlkwc{quote}\hlstd{=}\hlnum{FALSE}\hlstd{,} \hlkwc{row.names}\hlstd{=}\hlnum{FALSE}\hlstd{)}
\end{alltt}
\end{kframe}
\end{knitrout}
......@@ -580,6 +582,20 @@ I will try with PSI-BLAST.
\end{kframe}
\end{knitrout}
For the other species, use PSI-BLAST.
\textit{When using PSI-BLAST the results of a normal BLAST search are aligned and used to construct a pattern of conserved residues. This pattern is used for the next round of searching instead of the original query sequence. The process is repeated (iterated) until a final database search finds no more related sequences. When the process ends in this fashion, it is said to have converged.}
\textbf{TO DO:}
\begin{itemize}
\item Install PSI-BLAST.
\item Use PSI-BLAST.
\item Find a way to filter within the DB.
\item Or make a sub-DB?
\end{itemize}
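The iterated search described above could be driven from R along the following lines. This is only a sketch under assumptions: the helper `build_psiblast_args` and the file names `query.faa`, `legio_db`, `hits.tsv` and `gupta_ids.txt` are hypothetical, and the option names (`-num_iterations`, `-evalue`, `-outfmt`, `-seqidlist`) are those of the BLAST+ `psiblast` program as I understand them; check them against `psiblast -help` for the installed version. The optional `-seqidlist` argument is one possible answer to the "filter within the DB" item, since it restricts the search to a list of sequence IDs without building a separate database.

```r
# Build the argument vector for a psiblast call (BLAST+), without running it.
# All file and database names used below are placeholders.
build_psiblast_args <- function(query, db, out,
                                num_iterations = 3,
                                evalue = 1e-5,
                                seqidlist = NULL) {
  args <- c("-query", query,
            "-db", db,
            "-num_iterations", as.character(num_iterations),
            "-evalue", format(evalue, scientific = TRUE),
            "-outfmt", "6",          # tabular output
            "-out", out)
  # Optionally restrict the search to a subset of the database
  if (!is.null(seqidlist)) {
    args <- c(args, "-seqidlist", seqidlist)
  }
  args
}

args_all <- build_psiblast_args("query.faa", "legio_db", "hits.tsv")
args_sub <- build_psiblast_args("query.faa", "legio_db", "hits.tsv",
                                seqidlist = "gupta_ids.txt")

# Show the command line that would be executed
cat("psiblast", args_sub, "\n")

# To actually run it (requires BLAST+ on the PATH):
# system2("psiblast", args_sub)
```

Keeping the argument construction separate from `system2()` lets the command be inspected and logged in the knitr report before any search is launched.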
\subsection{Get 78 sequences from Gupta species}
......
file species size ncontig comr rocc
942 GCA_900452395.1_50618_B01_genomic Legionella beliardensis 3.5 6 FALSE FALSE
996 GCA_900639825.1_Leg_beliardensis_Wilkinson_1407-AL-H.v1_genomic Legionella beliardensis 3.4 31 FALSE FALSE
271 GCA_002240035.1_ASM224003v1_genomic Legionella clemsonensis 3.2 1 FALSE FALSE
7 GCA_000162755.2_ASM16275v2_genomic Legionella drancourtii 4.1 58 TRUE FALSE
48 GCA_000621525.1_ASM62152v1_genomic Legionella fairfieldensis 2.6 57 FALSE FALSE
1023 GCA_900640125.1_Leg_fairfieldensis_1725-Aus-E.v1_genomic Legionella fairfieldensis 2.6 76 FALSE FALSE
93 GCA_000953135.1_LFA_genomic Legionella fallonii 4.3 3 TRUE FALSE
1001 GCA_900639875.1_Leg_impletisoli_OA1-1.v1_genomic Legionella impletisoli 2.5 20 FALSE FALSE
4 GCA_000091785.1_ASM9178v1_genomic Legionella longbeachae 4.1 2 FALSE TRUE
8 GCA_000176095.1_ASM17609v1_genomic Legionella longbeachae 4 13 FALSE TRUE
267 GCA_002073455.2_ASM207345v2_genomic Legionella longbeachae 4.1 1 FALSE TRUE
270 GCA_002113845.3_ASM211384v3_genomic Legionella longbeachae 4.2 2 FALSE TRUE
439 GCA_004283175.1_ASM428317v1_genomic Legionella longbeachae 4 38 FALSE TRUE
563 GCA_008807315.1_ASM880731v1_genomic Legionella longbeachae 4.3 2 FALSE TRUE
580 GCA_011465255.1_ASM1146525v1_genomic Legionella longbeachae 4.1 3 FALSE TRUE
581 GCA_011465395.1_ASM1146539v1_genomic Legionella longbeachae 4 2 FALSE TRUE
971 GCA_900461575.1_42650_G01_genomic Legionella longbeachae 2.7 2 FALSE FALSE
56 GCA_000756695.1_PRJEB110_assembly_1_genomic Legionella massiliensis 4.3 8 FALSE FALSE
57 GCA_000756815.1_PRJEB6598_assembly_1_genomic Legionella massiliensis 4.3 8 FALSE FALSE
1004 GCA_900639915.1_Leg_nagasakiensis_JCM_15315.v1_genomic Legionella nagasakiensis 2.7 54 FALSE FALSE
58 GCA_000770585.1_ASM77058v1_genomic Legionella norrlandica 3 157 TRUE TRUE
1011 GCA_900639985.1_Leg_rowbothamii_LLAP6.v1_genomic Legionella rowbothamii 3.9 35 FALSE FALSE
307 GCA_003070625.1_ASM307062v1_genomic Legionella taurinensis 3 46 FALSE FALSE
308 GCA_003070645.1_ASM307064v1_genomic Legionella taurinensis 3 46 FALSE FALSE
309 GCA_003070665.1_ASM307066v1_genomic Legionella taurinensis 3 43 FALSE FALSE
310 GCA_003070675.1_ASM307067v1_genomic Legionella taurinensis 3 61 FALSE FALSE
325 GCA_003602125.1_ASM360212v1_genomic Legionella taurinensis 3.2 118 FALSE FALSE
326 GCA_003602175.1_ASM360217v1_genomic Legionella taurinensis 3.3 97 FALSE FALSE
445 GCA_004920375.1_ASM492037v1_genomic Legionella taurinensis 3 89 FALSE FALSE
446 GCA_004920385.1_ASM492038v1_genomic Legionella taurinensis 3 44 FALSE FALSE
448 GCA_004920415.1_ASM492041v1_genomic Legionella taurinensis 3 61 FALSE FALSE
449 GCA_004920475.1_ASM492047v1_genomic Legionella taurinensis 3 46 FALSE FALSE
450 GCA_004920485.1_ASM492048v1_genomic Legionella taurinensis 3 44 FALSE FALSE
451 GCA_004920495.1_ASM492049v1_genomic Legionella taurinensis 3 44 FALSE FALSE
509 GCA_004921725.1_ASM492172v1_genomic Legionella taurinensis 3 46 FALSE FALSE
510 GCA_004921735.1_ASM492173v1_genomic Legionella taurinensis 3 46 FALSE FALSE
963 GCA_900452865.1_50618_H01_genomic Legionella taurinensis 3.1 2 FALSE FALSE
1019 GCA_900640065.1_Leg_taurinensis_Turin_I.v1_genomic Legionella taurinensis 3 41 FALSE FALSE
54 GCA_000701265.1_ASM70126v1_genomic Legionella wadsworthii 3.5 16 FALSE FALSE
964 GCA_900452925.1_42650_E02_genomic Legionella wadsworthii 3.5 2 FALSE FALSE
1017 GCA_900640045.1_Leg_wadsworthii_81-716.v1_genomic Legionella wadsworthii 3.5 24 FALSE FALSE
1022 GCA_900640115.1_Leg_yabuuchiae_OA1-2.v1_genomic Legionella yabuuchiae 2.6 89 FALSE FALSE
Assembly Taxonomy Origin
GCA_001467055.1 Legionella adelaidensis Burstein
GCA_001467525.1 Legionella anisa Burstein
GCA_001467505.1 Legionella birminghamensis Burstein
GCA_001467045.1 Legionella bozemanae Burstein
GCA_001467025.1 Legionella brunensis Burstein
GCA_001467035.1 Legionella cherrii Burstein
GCA_001467545.1 Legionella cincinnatiensis Burstein
GCA_001467585.1 Legionella drozanskii LLAP-1 Burstein
GCA_001467615.1 Legionella erythra Burstein
GCA_001467625.1 Legionella feeleii Burstein
GCA_001467645.1 Legionella geestiana Burstein
GCA_001467695.1 Legionella gratiana Burstein
GCA_001467705.1 Legionella hackeliae Burstein
GCA_001467785.1 Legionella israelensis Burstein
GCA_001467745.1 Legionella jamestowniensis Burstein
GCA_001467765.1 Legionella jordanis Burstein
GCA_001467795.1 Legionella lansingensis Burstein
GCA_001467825.1 Legionella londiniensis Burstein
GCA_001467845.1 Legionella maceachernii Burstein
GCA_001467865.1 Legionella moravica Burstein
GCA_001467895.1 Legionella nautarum Burstein
GCA_001467925.1 Legionella oakridgensis Burstein
GCA_001467945.1 Legionella parisiensis Burstein
GCA_001467955.1 Legionella quateirensis Burstein
GCA_001467975.1 Legionella quinlivanii Burstein
GCA_001468125.1 Legionella rubrilucens Burstein
GCA_001468105.1 Legionella sainthelensi Burstein
GCA_001468135.1 Legionella santicrucis Burstein
GCA_001468025.1 Legionella shakespearei DSM 23087 Burstein
GCA_001468165.1 Legionella spiritensis Burstein
GCA_001468005.1 Legionella steelei Burstein
GCA_001468065.1 Legionella steigerwaltii Burstein
GCA_001468035.1 Legionella tucsonensis Burstein
GCA_001468085.1 Legionella waltersii Burstein
GCA_001467535.1 Legionella worsleiensis Burstein
GCA_900452395.1 Legionella beliardensis gupta
GCA_002240035.1 Legionella clemsonensis gupta
GCA_000162755.2 Legionella drancourtii gupta
GCA_002776555.1 Legionella endosymbiont of Polyplax serrata PsAG gupta
GCA_000621525.1 Legionella fairfieldensis gupta
GCA_000953135.1 Legionella fallonii gupta
GCA_900639875.1 Legionella impletisoli gupta
GCA_000091785.1 Legionella longbeachae gupta
GCA_000756815.1 Legionella massiliensis gupta
GCA_900639915.1 Legionella nagasakiensis gupta
GCA_000770585.1 Legionella norrlandica gupta
GCA_900639985.1 Legionella rowbothamii gupta
GCA_001465875.1 Legionella saoudiensis gupta
GCA_003070665.1 Legionella taurinensis gupta
GCA_000308315.1 Legionella tunisiensis gupta
GCA_000701265.1 Legionella wadsworthii gupta
GCA_900640115.1 Legionella yabuuchiae gupta
GCA_001572745.1 Coxiella burnetii gupta
GCA_000048645.1 Legionella pneumophila Paris pneumo
GCA_000048665.1 Legionella pneumophila Lens pneumo
GCA_000008485.1 Legionella pneumophila Philadelphia pneumo
GCA_000092545.1 Legionella pneumophila Corby pneumo
GCA_000092625.1 Legionella pneumophila Alcoy pneumo
GCA_000306865.1 Legionella pneumophila Lorraine pneumo
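Because the table is written with `quote=FALSE` and the default space separator, taxonomy names that contain spaces (for example "Legionella drozanskii LLAP-1") cannot be read back reliably with a plain `read.table` call. A minimal sketch of one way to parse such lines is given below, assuming that the first field is always the assembly accession and the last field is always the origin, as in the listing above; the helper name `parse_assembly_line` is hypothetical.

```r
# Split one space-separated line into Assembly / Taxonomy / Origin,
# assuming the first token is the accession and the last is the origin.
parse_assembly_line <- function(line) {
  fields <- strsplit(trimws(line), "[[:space:]]+")[[1]]
  n <- length(fields)
  stopifnot(n >= 3)
  data.frame(Assembly = fields[1],
             Taxonomy = paste(fields[2:(n - 1)], collapse = " "),
             Origin   = fields[n],
             stringsAsFactors = FALSE)
}

# Two lines taken from the listing above
lines <- c("GCA_001467585.1 Legionella drozanskii LLAP-1 Burstein",
           "GCA_000048645.1 Legionella pneumophila Paris pneumo")

tab <- do.call(rbind, lapply(lines, parse_assembly_line))
print(tab)
```

An alternative that avoids the ambiguity at the source would be to write the file with an explicit tab separator (`sep="\t"` in `write.table`) and read it back with the same separator.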