====== Tutorial 11 - Protein structure, the MODELLER software ======

===== Recap =====

Make sure you can answer the following questions:

  * Describe the levels of protein structure.
  * Explain the general idea of the "Branch and Bound" method.
  * Explain the meaning of the following terms as applied to genes: analog, homolog, paralog, ortholog and xenolog.
  * What is a protein ligand?

===== Homology modeling - protein structure prediction exercise =====

A simple, although not always reliable, way to discover the secondary structure of a peptide sequence is to look up a protein with a similar primary sequence in a database. Let us try this! The task is to obtain the secondary structure of the following peptide sequence: ''HYLCKYVINAIPPTLTAKIHFRPELPAERNQLIQRLA''

  - Go to [[https://blast.ncbi.nlm.nih.gov/Blast.cgi]] and click "Protein blast".
  - Enter the sequence and enter "Homo sapiens (taxid:9606)" as organism.
  - Click the blast button and wait. This may take up to several minutes.
  - Look for the best matching protein. It should be: "monoamine oxidase A"
  - Enter this protein name to [[https://www.uniprot.org/uniprot/|UniProt]]. /********** note: only the third match in the list has the correct length and can be used for the alignment **********/
  - Check whether the result has a secondary structure annotation and find the position corresponding to the BLAST match.

Use the above-described procedure to learn as much as possible about the following peptide sequence: ''TEYAINKLRQLYVLRC''. A hint: the sequence is a part of a frequent [[https://en.wikipedia.org/wiki/Protein_domain|protein domain]].

/**
  * It is a part of the SH2 domain.
  * The SH2 domain (Src-homology 2 domain) is a structural domain found to varying degrees in all eukaryotic organisms; it is characterized by binding to phosphorylated tyrosine (phosphotyrosine, pY). It is part of a wide range of proteins in the cell, mainly signaling proteins. It is also part of the Src oncogene, which can cause cancerous growth.
  * It was taken from: https://www.pnas.org/doi/10.1073/pnas.011577898, Fig.1 (alphaA and betaB, the first two proteins JAK1 and JAK2 merged).
  * BLASTP finds: Tyrosine-protein kinase JAK1, total length 1154, match at positions 446-466.
  * JAK1 belongs to the tyrosine kinases, i.e. enzymes from the protein kinase group that catalyze the transfer of a phosphate group (phosphorylation) from nucleoside triphosphates (mostly ATP) to the amino acid tyrosine in proteins. Phosphorylation is the most common post-translational modification of proteins and plays an important role in the regulation of many cellular signaling pathways.
  * UniProt Tyrosine-protein kinase JAK1 record:
    * Molecule processing: check that the length is the same,
    * Secondary structure: 446 ... helix starts, 463 ... beta sheet starts,
    * Domains and Repeats: 439 – 544 ... SH2.
*/

===== Threading exercise =====

Recall the branch-and-bound threading algorithm from the lecture. Suppose we have three segments (i, j, k), each of which includes three amino acids. For a given sequence there are three possible starting positions for each segment: i ∈ {2,3,4}, j ∈ {8,9,10}, k ∈ {13,14,15}. We will be using the simple lower bound:

{{:courses:bin:tutorials:lb.png?400}}

Suppose that you are given the following values for the scores of the individual segments and the scores for segment interactions:

^ i ^ j ^ k ^
|g1(i,2) = 5 | g1(j,8) = 9| g1(k,13) = 3|
|g1(i,3) = 2 | g1(j,9) = 7| g1(k,14) = 4|
|g1(i,4) = 8 | g1(j,10) = 6| g1(k,15) = 1|

^i/j ^ j/k ^ i/k ^
|g2(i,j,2,8) = 1|g2(j,k,8,13) = 7|g2(i,k,2,13) = 1|
|g2(i,j,2,9) = 2|g2(j,k,8,14) = 8|g2(i,k,2,14) = 2|
|g2(i,j,2,10) = 2|g2(j,k,8,15) = 7|g2(i,k,2,15) = 5|
|g2(i,j,3,8) = 5|g2(j,k,9,13) = 1|g2(i,k,3,13) = 5|
|g2(i,j,3,9) = 6|g2(j,k,9,14) = 6|g2(i,k,3,14) = 6|
|g2(i,j,3,10) = 4|g2(j,k,9,15) = 8|g2(i,k,3,15) = 4|
|g2(i,j,4,8) = 7|g2(j,k,10,13) = 11|g2(i,k,4,13) = 1|
|g2(i,j,4,9) = 3|g2(j,k,10,14) = 12|g2(i,k,4,14) = 2|
|g2(i,j,4,10) = 4|g2(j,k,10,15) = 13|g2(i,k,4,15) = 4|

Using this information, **compute the optimal threading**.
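
To check a hand-computed answer to the threading exercise, the search can be sketched in Python. This is only a minimal sketch under several assumptions: scores are minimized; the "simple lower bound" is taken to be the sum, over segments, of the smallest g1 among the still-allowed positions plus the sum, over segment pairs, of the smallest g2 among the still-allowed position pairs (the formula in the figure may differ in detail); and the search fixes one segment at a time in best-first order rather than splitting position intervals as in the lecture. The names ''G1'', ''G2'', ''lower_bound'' and ''branch_and_bound'' are illustrative and not taken from the lecture.

```python
import heapq
from itertools import combinations

SEGMENTS = ("i", "j", "k")
POSITIONS = {"i": (2, 3, 4), "j": (8, 9, 10), "k": (13, 14, 15)}

# Segment scores g1(segment, position) from the first table.
G1 = {
    "i": {2: 5, 3: 2, 4: 8},
    "j": {8: 9, 9: 7, 10: 6},
    "k": {13: 3, 14: 4, 15: 1},
}

# Interaction scores g2(seg_a, seg_b, pos_a, pos_b) from the second table.
G2 = {
    ("i", "j"): {(2, 8): 1, (2, 9): 2, (2, 10): 2,
                 (3, 8): 5, (3, 9): 6, (3, 10): 4,
                 (4, 8): 7, (4, 9): 3, (4, 10): 4},
    ("j", "k"): {(8, 13): 7, (8, 14): 8, (8, 15): 7,
                 (9, 13): 1, (9, 14): 6, (9, 15): 8,
                 (10, 13): 11, (10, 14): 12, (10, 15): 13},
    ("i", "k"): {(2, 13): 1, (2, 14): 2, (2, 15): 5,
                 (3, 13): 5, (3, 14): 6, (3, 15): 4,
                 (4, 13): 1, (4, 14): 2, (4, 15): 4},
}


def lower_bound(allowed):
    """Assumed simple bound: sum of per-term minima over allowed positions."""
    bound = sum(min(G1[s][p] for p in allowed[s]) for s in SEGMENTS)
    for a, b in combinations(SEGMENTS, 2):
        bound += min(G2[(a, b)][(x, y)]
                     for x in allowed[a] for y in allowed[b])
    return bound


def branch_and_bound():
    """Best-first search; returns (threading, score) with minimal score."""
    start = {s: POSITIONS[s] for s in SEGMENTS}
    counter = 0  # tie-breaker so the heap never compares dicts
    heap = [(lower_bound(start), counter, start)]
    while heap:
        bound, _, allowed = heapq.heappop(heap)
        open_segments = [s for s in SEGMENTS if len(allowed[s]) > 1]
        if not open_segments:
            # Every segment is fixed, so the bound is the exact score, and
            # no node left in the heap can have a smaller one.
            return {s: allowed[s][0] for s in SEGMENTS}, bound
        seg = open_segments[0]
        for p in allowed[seg]:  # branch: fix one position of this segment
            child = dict(allowed)
            child[seg] = (p,)
            counter += 1
            heapq.heappush(heap, (lower_bound(child), counter, child))


if __name__ == "__main__":
    threading, score = branch_and_bound()
    print(threading, score)
```

Because the bound never overestimates the score of any threading contained in a node, the first fully fixed node taken from the queue is optimal. With only 27 candidate threadings, exhaustive enumeration would serve equally well as a cross-check.
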
===== MODELLER overview =====

The purpose of this tutorial is to get hands-on experience with MODELLER, a software package for comparative protein structure modelling. We will go through a basic tutorial for this software. At the end of this tutorial, you should have a better understanding of what such software is capable of.

===== Installation =====

Download the MODELLER version for your operating system from: [[https://salilab.org/modeller/download_installation.html]] (Note: It is also available from official repositories of some GNU/Linux distributions.)

You need to register in order to obtain a license here: [[https://salilab.org/modeller/registration.html]] You should provide your university e-mail in the registration form.

===== The tutorial =====

We will follow this tutorial from the MODELLER webpages: [[https://salilab.org/modeller/tutorial/basic.html]]. For those interested in learning more about MODELLER, there are also advanced tutorials: [[https://salilab.org/modeller/tutorial/]]

===== See also =====

If you are interested, you may also have a look at the [[https://zhanglab.ccmb.med.umich.edu/I-TASSER/|I-TASSER]] software.

A [[http://www.bpc.uni-frankfurt.de/guentert/wiki/images/b/b1/180625_Tutorial_Modelling.pdf|tutorial]] on homology modelling from the University of Frankfurt.

===== References =====

Branch and bound threading example taken from [[https://www.biostat.wisc.edu/bmi776/spring-17/lectures/threading.pdf]] {{ :courses:bin:tutorials:threading.pdf |}}