====== HMMER ======

This tutorial follows the HMMER User’s Guide written by Sean R. Eddy, Travis J. Wheeler and the HMMER development team. Thanks.

===== Problem 1 - Install HMMER =====

First, we download HMMER v3.1b2 from http://hmmer.org and unpack it.

  wget http://eddylab.org/software/hmmer3/3.1b2/hmmer-3.1b2.tar.gz
  tar xfv hmmer-3.1b2.tar.gz
  cd hmmer-3.1b2

We now have the HMMER source code, so it is time to compile it.

  ./configure
  make
  make check

We probably do not have root privileges, so instead of installing system-wide we add the directory containing the binaries to the PATH variable.

  export PATH=/path/to/hmmer-3.1b2/src:$PATH

===== The programs in HMMER =====

  - Build models and align sequences (DNA or protein)
    - hmmbuild - Build a profile HMM from an input multiple alignment.
    - hmmalign - Make a multiple alignment of many sequences to a common profile HMM.
  - Search protein queries against a protein database
    - phmmer - Search a single protein sequence against a protein sequence database. (BLASTP-like)
    - jackhmmer - Iteratively search a protein sequence against a protein sequence database. (PSIBLAST-like)
    - hmmsearch - Search a protein profile HMM against a protein sequence database.
    - hmmscan - Search a protein sequence against a protein profile HMM database.
    - hmmpgmd - Search daemon used for the hmmer.org website.

And many others...

=====Searching a protein sequence database with a single protein profile HMM=====

===Step 1: build a profile HMM with hmmbuild===

A common use of HMMER is to search a sequence database for homologues of a protein family of interest.

Look at the file ''tutorial/globins4.sto''. What can you see there?

Construct an HMM from the alignment:

  hmmbuild globins4.hmm tutorial/globins4.sto

Look at ''globins4.hmm''. Is it what you expected? Now you have a profile HMM; the next step is to search a sequence database with it.

===Step 2: search the sequence database with hmmsearch===

Run the example search against ''tutorial/globins45.fa''.

  hmmsearch globins4.hmm tutorial/globins45.fa > globins4.out

Discuss the results!
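
The full hmmsearch report is written for human readers. For further processing, a tabular summary is often easier to handle. The following is a minimal sketch, assuming the files created in the steps above: the ''--tblout'' option of hmmsearch writes one line per target sequence, and a small helper prints the target name and the full-sequence E-value of each hit. The helper name ''list_hits'' and the output file ''globins4.tbl'' are made up for this example. The search is skipped if hmmsearch or the input files are not found.

```shell
# list_hits FILE: print "target_name full_sequence_E-value" for each hit in a
# --tblout file. Lines starting with '#' are comments and are skipped.
# (list_hits is a hypothetical helper name, not part of HMMER.)
list_hits() {
    awk '!/^#/ && NF { print $1, $5 }' "$1"
}

# Run the search only if HMMER and the tutorial files are available.
if command -v hmmsearch >/dev/null 2>&1 \
   && [ -f globins4.hmm ] && [ -f tutorial/globins45.fa ]; then
    hmmsearch --tblout globins4.tbl globins4.hmm tutorial/globins45.fa > globins4.out
    list_hits globins4.tbl
fi
```

Sorting or filtering this two-column listing with standard tools such as ''sort'' or ''awk'' is then straightforward.
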
====Single sequence protein queries using phmmer====

Like BLASTP or FASTA, phmmer searches a single query sequence against a sequence database.

  phmmer tutorial/HBB_HUMAN tutorial/globins45.fa > phmmer.out

The output is essentially the same as for hmmsearch.

====Searching a profile HMM database with a query sequence====

The HMM database can be Pfam, SMART, TIGRFams, or another database of your choice.

===Step 1: create an HMM database flatfile===

A flatfile is just a concatenation of individual HMM files. We therefore first build the individual HMM files with ''hmmbuild'' and then concatenate them with ''cat''. Let's create a small database from the files in the tutorial directory.

  hmmbuild globins4.hmm tutorial/globins4.sto
  hmmbuild fn3.hmm tutorial/fn3.sto
  hmmbuild Pkinase.hmm tutorial/Pkinase.sto
  cat globins4.hmm fn3.hmm Pkinase.hmm > minifam

Here, ''minifam'' is our new HMM database. To speed up searches, compress and index the flatfile with hmmpress.

  hmmpress minifam

Four new binary files appear in the directory (''minifam.h3m'', ''minifam.h3i'', ''minifam.h3f'', and ''minifam.h3p'').

===Step 2: search the HMM database with hmmscan===

Now we can analyze sequences using our HMM database and ''hmmscan''.

  hmmscan minifam tutorial/7LESS_DROME

====TASK 1: find records in Pfam database ====

In the previous example, we used three alignments in Stockholm format: ''globins4.sto'', ''fn3.sto'', and ''Pkinase.sto''. But what can we do if we do not have the source alignments at hand? Which database should we use?

Your task is the following:

  - Choose a database of profile HMMs.
  - Find three protein families: ''globin'', ''fn3'', and ''Pkinase''. Download the seed alignment of each family (the seed contains representative members of the family that are judged to be well aligned) in Stockholm format.
  - Construct the HMM database (see the previous example).
  - Analyse ''tutorial/7LESS_DROME'' using your HMM database.
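
The build, concatenate, and press steps can be wrapped in one small shell function, which is handy once the seed alignments are downloaded. This is only a sketch under stated assumptions: the function name ''build_hmm_db'', the database name ''myfam'', and the file names ''globin_seed.sto'', ''fn3_seed.sto'', and ''Pkinase_seed.sto'' are hypothetical, so use whatever names your downloads actually have. The ''-f'' option makes hmmpress overwrite files from an earlier run.

```shell
# build_hmm_db DB ALIGNMENT.sto...
# Build one profile HMM per Stockholm alignment, concatenate them into the
# flatfile DB, and index DB with hmmpress.
# (build_hmm_db is a hypothetical helper name, not part of HMMER.)
build_hmm_db() {
    db=$1
    shift
    : > "$db"                                  # start from an empty flatfile
    for sto in "$@"; do
        hmm="${sto%.sto}.hmm"                  # e.g. fn3_seed.sto -> fn3_seed.hmm
        hmmbuild "$hmm" "$sto" > /dev/null || return 1
        cat "$hmm" >> "$db"
    done
    hmmpress -f "$db" > /dev/null              # -f: overwrite old index files
}

# Assumed file names; the block is skipped if HMMER or the seeds are missing.
if command -v hmmbuild >/dev/null 2>&1 \
   && [ -f globin_seed.sto ] && [ -f fn3_seed.sto ] && [ -f Pkinase_seed.sto ]; then
    build_hmm_db myfam globin_seed.sto fn3_seed.sto Pkinase_seed.sto \
        && hmmscan myfam tutorial/7LESS_DROME > 7LESS_DROME.out
fi
```

The function stops at the first alignment that hmmbuild cannot process, so a failed download does not silently produce an incomplete database.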