jplantad / CIRI2021 - Get consensus sequences from VCF
Commit cc649762
authored May 25, 2021 by jplantad

compress, index and merge

parent 61d48924
Showing 3 changed files with 67 additions and 0 deletions

compress_vcf_to_gz.sh  +36 -0
generate_index.sh  +17 -0
merge_fastafiles.sh  +14 -0

compress_vcf_to_gz.sh  0 → 100755  (+36 -0)

#!/bin/bash

# set home directory
home="/home/stagiaire/Bureau/phylogenetics/"
cd ${home}

# list the vcf that need to be compressed
data_999_folder="/home/stagiaire/Bureau/gitlab/data_fixed_999/"
vcf_list=$(ls ${data_999_folder})
# check the list
echo ${vcf_list}

# create a new folder that will contain the gz files, and get into it
mkdir -p data_999_gz
cd data_999_gz/
# check working directory
pwd

for vcf in ${vcf_list}
do
	echo -e "#####\nProcessing "${vcf}
	# suppress the GL line in the VCF file header and save the output into a temporary VCF file
	sed '/^##FORMAT=<ID=GL/d' ${data_999_folder}${vcf} > ${vcf}_tmp.vcf
	echo "temporary VCF file created"
	# compress the temporary VCF file
	bcftools view ${vcf}_tmp.vcf -Oz -o ${vcf}.gz
	echo "compressed file computed"
	# remove the temporary VCF file
	rm ${vcf}_tmp.vcf
	echo "temporary VCF file deleted"
	echo -e ${vcf}" processed.\n#####"
done

# check that the compressed files are in the folder
ls -l
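
The loop above iterates over the output of `ls` and uses unquoted expansions, so a path containing spaces would be split into several words. A minimal sketch of a more defensive variant follows. It assumes the input files end in `.vcf` (the commit does not state their extension), and the function names `strip_gl_header` and `compress_all` are hypothetical helpers, not part of the commit. It also pipes the filtered VCF straight into `bcftools view` via `-` (standard input) instead of writing a temporary file.

```shell
#!/bin/bash

# Drop the ##FORMAT=<ID=GL header line, as the committed script does.
# (hypothetical helper name)
strip_gl_header() {
    sed '/^##FORMAT=<ID=GL/d' "$1"
}

# Compress every *.vcf in in_dir into out_dir/<name>.vcf.gz.
# (hypothetical helper name; assumes inputs end in .vcf)
compress_all() {
    local in_dir="$1"
    local out_dir="$2"
    local vcf name
    mkdir -p "$out_dir" || return 1
    for vcf in "$in_dir"/*.vcf; do
        # if nothing matches, the glob stays literal: skip it
        [ -e "$vcf" ] || continue
        name=$(basename "$vcf")
        echo "Processing $name"
        # "-" makes bcftools read the VCF from standard input
        strip_gl_header "$vcf" | bcftools view -Oz -o "$out_dir/$name.gz" - || return 1
    done
}
```

With the paths from the commit, this could be called as `compress_all /home/stagiaire/Bureau/gitlab/data_fixed_999 data_999_gz` from the `phylogenetics` directory.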

generate_index.sh  0 → 100755  (+17 -0)

#!/bin/bash
# generate an index for a VCF file

# set home directory
home="/home/stagiaire/Bureau/phylogenetics/data_999_gz/"
cd ${home}

# list VCF files
vcf_list=$(ls)

for vcf in ${vcf_list}
do
	# index the VCF file
	bcftools index -f ${vcf} -o ${vcf}.csi
	echo "##### "${vcf}" done."
done
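
Because `vcf_list` is built from a bare `ls`, a second run of this script would also pick up the `.csi` files created by the first run and pass them to `bcftools index`. A minimal sketch that restricts the loop to `*.vcf.gz` is shown below. The function name `index_all` is hypothetical. The `-o` option is left out because `bcftools index` writes a CSI index next to the input as `<file>.csi` by default.

```shell
#!/bin/bash

# Index every *.vcf.gz in gz_dir, ignoring index files from earlier runs.
# (hypothetical helper name)
index_all() {
    local gz_dir="$1"
    local gz
    for gz in "$gz_dir"/*.vcf.gz; do
        # if nothing matches, the glob stays literal: skip it
        [ -e "$gz" ] || continue
        # -f overwrites an existing index; default output is "$gz.csi"
        bcftools index -f "$gz" || return 1
        echo "##### $(basename "$gz") done."
    done
}
```

With the path from the commit, this could be called as `index_all /home/stagiaire/Bureau/phylogenetics/data_999_gz`.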

merge_fastafiles.sh  0 → 100755  (+14 -0)

#!/bin/bash
# merging fasta files of all individuals for each gene

# set home directory
home="/home/stagiaire/Bureau/phylogenetics/data_sequences/pon/"
cd ${home}

indiv_list=$(ls *renamed_AB.fa)

for indiv in ${indiv_list}
do
	cat ${indiv} >> "BST2_ponAbe2_all.fa"
done
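
The loop appends with `>>`, so running the script twice leaves every sequence in `BST2_ponAbe2_all.fa` twice. A minimal sketch of a rerun-safe variant is given below: it concatenates all matching files in a single `cat` call and overwrites the output with `>`. The function name `merge_fasta` is hypothetical, and the sketch assumes the output file name does not itself match `*renamed_AB.fa`, which holds for the name used in the commit.

```shell
#!/bin/bash

# Concatenate all *renamed_AB.fa files in dir into out, overwriting out.
# (hypothetical helper name)
merge_fasta() {
    local dir="$1"
    local out="$2"
    # replace the function's positional parameters with the matching files
    set -- "$dir"/*renamed_AB.fa
    [ -e "$1" ] || { echo "no *renamed_AB.fa files in $dir" >&2; return 1; }
    cat "$@" > "$out"
}
```

With the path from the commit, this could be called from the `pon` directory as `merge_fasta . BST2_ponAbe2_all.fa`.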