courses:bin:tutorials:hmmer [2019/04/11 12:31]
courses:bin:tutorials:hmmer [2024/02/09 10:17]
127.0.0.1 external edit
Line 1: Line 1:
 +====== HMMER ======
 +This tutorial follows the HMMER User’s Guide written by Sean R. Eddy, Travis J. Wheeler and the HMMER development team. Thanks.
 +===== Problem 1 - Install HMMER =====
 +First, we download HMMER v3.1b2 from http://hmmer.org and unpack it.
 +<code|bash>
 +wget http://eddylab.org/software/hmmer3/3.1b2/hmmer-3.1b2.tar.gz
 +tar xfv hmmer-3.1b2.tar.gz
 +cd hmmer-3.1b2
 +</code>
 +With the HMMER source code unpacked, it's time to compile it and run the test suite.
 +<code|bash>
 +./configure
 +make
 +make check
 +</code>
 +We probably don't have root privileges for a system-wide install, so instead add the directory containing the compiled binaries to the PATH variable.
 +<code|bash>
 +export PATH=/path/to/hmmer-3.1b2/src:$PATH
 +</code>
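A quick way to confirm that the PATH change took effect is to ask the shell whether it can find the programs. The sketch below is only an illustration: ''check_tools'' is a hypothetical helper name, not part of HMMER, and the program list is simply the set of commands used later in this tutorial.

```shell
# Report whether each named program can be found on the current PATH.
# Returns 0 if all of them are found, 1 otherwise.
# check_tools is a hypothetical helper, not something shipped with HMMER.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The program names below are the ones used in this tutorial.
check_tools hmmbuild hmmsearch phmmer hmmpress hmmscan \
    || echo "Check the PATH entry for hmmer-3.1b2/src"
```

If any program is reported as missing, the most likely cause is that the path in the ''export'' line does not point at the ''src'' directory of the unpacked source tree.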
 +
 +===== The programs in HMMER =====
 +- Build models and align sequences (DNA or protein)
 + - hmmbuild - Build a profile HMM from an input multiple alignment.
 + - hmmalign - Make a multiple alignment of many sequences to a common profile HMM.
 +- Search protein queries against protein database
 + - phmmer - Search a single protein sequence against a protein sequence database. (BLASTP-like)
 + - jackhmmer - Iteratively search a protein sequence against a protein sequence database. (PSIBLAST-like)
 + - hmmsearch - Search a protein profile HMM against a protein sequence database.
 + - hmmscan - Search a protein sequence against a protein profile HMM database.
 + - hmmpgmd - Search daemon used for hmmer.org website.
 +
 +And many others.
 +
 +
 +=====Searching a protein sequence database with a single protein profile HMM=====
 +===Step 1: build a profile HMM with hmmbuild===
 +A common use of HMMER is to search a sequence database for homologues of a protein family of interest.
 +
 +Look at the file ''tutorial/globins4.sto''.
 +<note tip>What can you see there?</note>
 +
 +Construct an HMM from the alignment:
 +<code|bash>
 +hmmbuild globins4.hmm tutorial/globins4.sto
 +</code>
 +Look at ''globins4.hmm''. Is it what you expected?
 +Next, you need a sequence database to search.
 +
 +
 +===Step 2: search the sequence database with hmmsearch===
 +Run the example search against ''tutorial/globins45.fa'':
 +<code|bash>
 +hmmsearch globins4.hmm tutorial/globins45.fa > globins4.out
 +</code>
 +<note important>Discuss the results!</note>
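When discussing the results, it can help to have the hits in a form that is easy to sort and filter. ''hmmsearch'' can also write a whitespace-delimited per-sequence table with the ''--tblout'' option, in which comment lines start with ''#'', column 1 is the target name, column 5 the full-sequence E-value and column 6 the bit score. The following is a minimal sketch under assumptions: ''filter_hits'' is a hypothetical helper, ''globins4.tbl'' is an arbitrary output name, and the ''1e-10'' cutoff is chosen only for illustration.

```shell
# Print target name, full-sequence E-value and bit score for every hit
# in a --tblout file whose E-value is at or below the given cutoff.
# Lines starting with '#' are comments in the tabular format and are skipped.
# filter_hits is a hypothetical helper, not part of HMMER.
filter_hits() {
    tbl=$1
    cutoff=$2
    awk -v max="$cutoff" \
        '!/^#/ && ($5 + 0) <= (max + 0) { print $1, $5, $6 }' "$tbl"
}

# Only attempt the search when HMMER and the profile from Step 1 are available.
# globins4.tbl and the 1e-10 cutoff are illustrative assumptions.
if command -v hmmsearch >/dev/null 2>&1 && [ -f globins4.hmm ]; then
    hmmsearch --tblout globins4.tbl globins4.hmm tutorial/globins45.fa > globins4.out
    filter_hits globins4.tbl 1e-10
fi
```

The human-readable report still goes to ''globins4.out'', so the table is an addition to the output discussed above rather than a replacement for it.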
 +
 +====Single sequence protein queries using phmmer====
 +Like BLASTP or FASTA, ''phmmer'' searches a single query sequence against a sequence database.
 +
 +<code|bash>
 +phmmer tutorial/HBB_HUMAN tutorial/globins45.fa > phmmer.out
 +</code>
 +The output is essentially the same as for ''hmmsearch''.
 +
 +
 +====Searching a profile HMM database with a query sequence====
 +The HMM database can be Pfam, SMART, TIGRFAMs, or another database of your choice.
 +
 +===Step 1: create an HMM database flatfile===
 +A flatfile is just a concatenation of individual HMM files. So we first build the individual HMM files with ''hmmbuild'' and then concatenate them with ''cat''.
 +Let's create a small database from the alignments in the ''tutorial'' directory.
 +<code|bash>
 +hmmbuild globins4.hmm tutorial/globins4.sto
 +hmmbuild fn3.hmm tutorial/fn3.sto
 +hmmbuild Pkinase.hmm tutorial/Pkinase.sto
 +cat globins4.hmm fn3.hmm Pkinase.hmm > minifam
 +</code>
 +Here, ''minifam'' is our new HMM database. To speed up searches, compress and index the flatfile with ''hmmpress''.
 +<code|bash>
 +hmmpress minifam
 +</code>
 +You should now see four new binary files in the directory.
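The four binary files carry the suffixes ''.h3m'', ''.h3i'', ''.h3f'' and ''.h3p'' appended to the flatfile name. As a rough sketch, a small check like the one below can confirm they are all present before moving on; ''is_pressed'' is a hypothetical helper name introduced here for illustration.

```shell
# Return 0 if all four hmmpress index files exist next to the given flatfile,
# otherwise name the first missing file and return 1.
# is_pressed is a hypothetical helper, not part of HMMER.
is_pressed() {
    db=$1
    for ext in h3m h3i h3f h3p; do
        [ -f "$db.$ext" ] || { echo "missing: $db.$ext"; return 1; }
    done
    echo "pressed: $db"
}

is_pressed minifam || echo "Run 'hmmpress minifam' first"
```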
 +
 +===Step 2: search the HMM database with hmmscan===
 +Now we can analyze sequences using our HMM database and ''hmmscan''.
 +<code|bash>
 +hmmscan minifam tutorial/7LESS_DROME
 +</code>
 +
 +====TASK 1: find records in Pfam database ====
 +In the previous example, we used three alignments in Stockholm format: ''globins4.sto'', ''fn3.sto'', and ''Pkinase.sto''. But what can we do if we do not have any source alignments?
 +
 +<note tip>Which database to use?</note>
 +
 +Your task is as follows:
 +  - Choose a profile HMM database.
 +  - Find three protein families: ''globin'', ''fn3'', and ''Pkinase''. Download the seed alignment of each family (the seed contains representative members of the family that are judged to be well aligned) in Stockholm format.
 +  - Construct the HMM database (see the previous example).
 +  - Analyse ''tutorial/7LESS_DROME'' using our HMM database.
 +
 +
 +
 +