TAATGCCATGGGATGTT
TGGCA
GCATTGCAA
TGCAAT
CAATT
ATTTGAC
In this tutorial, we will perform a de novo assembly of the genome of an unknown organism. First, download the read data:
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR292/SRR292770/SRR292770_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR292/SRR292770/SRR292770_2.fastq.gz
The files are gzip-compressed. You can view them without unpacking by using zcat, zless, or zmore, for example:
zless SRR292770_1.fastq.gz
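In a FASTQ file, the header line of each read starts with @, and the separator line before the quality string starts with +.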
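As a quick sanity check on the downloaded reads, the sketch below counts the records in a gzip-compressed FASTQ file and prints the first one. The function names count_reads and first_record are hypothetical helpers introduced here, and the count assumes the usual layout of exactly four lines per read (no wrapped sequences).

```shell
# Hypothetical helpers for inspecting a gzip-compressed FASTQ file.
# Assumption: every record occupies exactly four lines.

# Print the number of reads: total line count divided by four.
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Print the first record: header (@...), sequence, separator (+), qualities.
first_record() {
    gzip -dc "$1" | head -n 4
}
```

For example, count_reads SRR292770_1.fastq.gz and count_reads SRR292770_2.fastq.gz should report the same number if the two files hold the two mates of each pair.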
Download and unpack the Velvet assembler. This algorithm was proposed here: https://doi.org/10.1101/gr.074492.107.
wget http://www.ebi.ac.uk/~zerbino/velvet/velvet_1.2.10.tgz
tar zxvf velvet_1.2.10.tgz
Now build the assembler.
cd velvet_1.2.10
make MAXKMERLENGTH=60 OPENMP=1
cd ..
At this point, we are ready to run the assembly. Velvet first hashes the reads with the velveth command; the velvetg command then constructs the de Bruijn graph. Run velveth and velvetg:
./velvet_1.2.10/velveth
./velvet_1.2.10/velvetg
-cov_cutoff 2.81
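To make the two steps described above more concrete, here is a toy sketch of what "hashing k-mers" and "building a de Bruijn graph" mean for a single short sequence. It follows the textbook formulation, in which every k-mer contributes an edge from its (k-1)-prefix to its (k-1)-suffix; it is not Velvet's internal data structure. The function names kmers and debruijn_edges are hypothetical.

```shell
# Toy illustration only; this is not how Velvet stores its graph.

# Print every k-mer of a sequence, one per line.
# Usage: kmers SEQUENCE K
kmers() {
    awk -v s="$1" -v k="$2" 'BEGIN {
        for (i = 1; i + k - 1 <= length(s); i++)
            print substr(s, i, k)
    }'
}

# Print the de Bruijn edges: each k-mer links its (k-1)-mer prefix
# to its (k-1)-mer suffix.
# Usage: debruijn_edges SEQUENCE K
debruijn_edges() {
    kmers "$1" "$2" | awk -v k="$2" '{
        print substr($0, 1, k - 1) " -> " substr($0, 2, k - 1)
    }'
}
```

Running debruijn_edges TGCAAT 4 prints the three edges TGC -> GCA, GCA -> CAA, and CAA -> AAT. Reads that share (k-1)-mers end up connected through the same nodes, which is what allows the assembler to walk from one read into the next.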
You can find out how many contigs were produced by counting the FASTA header lines:
grep -c '^>' <out_dir_35>/contigs.fa
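The number of contigs alone says little about assembly quality, so it can help to look at contig lengths as well. The sketch below lists the length of every contig in a FASTA file and derives the N50, the length of the contig at which the running total (longest first) reaches half of the assembled bases. The function names contig_lengths and n50 are hypothetical helpers, not part of Velvet.

```shell
# Hypothetical helpers for summarising a contigs FASTA file.

# Print the length of each contig, one per line, in file order.
# Sequence lines may be wrapped; lengths are summed per header.
contig_lengths() {
    awk '/^>/ { if (seen) print len; seen = 1; len = 0; next }
         { len += length($0) }
         END { if (seen) print len }' "$1"
}

# Print the N50: sort lengths in descending order and report the length
# at which the cumulative sum first reaches half of the total.
n50() {
    contig_lengths "$1" | sort -rn | awk '
        { lens[NR] = $1; total += $1 }
        END {
            for (i = 1; i <= NR; i++) {
                run += lens[i]
                if (run * 2 >= total) { print lens[i]; exit }
            }
        }'
}
```

Both functions take the path to a contigs.fa file as their only argument, so the same call can be repeated for each output directory to compare runs.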
Change k and other settings of the Velvet assembler, and observe how they influence the assembly results.
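One way to explore the effect of k is to run the assembler once per value and keep each result in its own directory. The sketch below is only an outline under several assumptions: the reads are treated as short paired-end reads stored in two separate gzip-compressed FASTQ files (velveth options -shortPaired, -fastq.gz, -separate), the coverage cutoff defaults to auto, and the output directories are named out_dir_K. The names assemble_k, sweep_k, VELVET_DIR, COV_CUTOFF, and RUN are hypothetical. Velvet expects odd k values, and the build above (MAXKMERLENGTH=60) caps how large k can be.

```shell
# Hypothetical wrapper for trying several k values; adjust to your data.
VELVET_DIR=${VELVET_DIR:-./velvet_1.2.10}   # where velveth/velvetg were built
COV_CUTOFF=${COV_CUTOFF:-auto}              # or a number, e.g. 2.81
RUN=${RUN:-}                                # set RUN=echo for a dry run

# Assemble once with a given k. Usage: assemble_k K READS_1 READS_2
assemble_k() {
    local k=$1 r1=$2 r2=$3
    local out="out_dir_${k}"
    # Step 1: hash the reads into k-mers.
    $RUN "$VELVET_DIR/velveth" "$out" "$k" -shortPaired -fastq.gz -separate "$r1" "$r2" &&
    # Step 2: build and simplify the de Bruijn graph, writing contigs.fa.
    $RUN "$VELVET_DIR/velvetg" "$out" -cov_cutoff "$COV_CUTOFF"
}

# Run assemble_k for each k given. Usage: sweep_k READS_1 READS_2 K...
sweep_k() {
    local r1=$1 r2=$2 k
    shift 2
    for k in "$@"; do
        assemble_k "$k" "$r1" "$r2" || return 1
    done
}
```

With RUN=echo set, a call such as sweep_k SRR292770_1.fastq.gz SRR292770_2.fastq.gz 31 35 41 only prints the commands it would run, which is a cheap way to check them before starting a long assembly.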