Search
Calculate the score of the following alignment
$$\begin{array}{l} \mathtt{MQPILL\_G} \\ \mathtt{MLR\_LL\_G} \\ \mathtt{MK\_ILLL\_} \\ \mathtt{MPPVLLI\_} \end{array}$$
Calculate the consensus sequence.
[Adapted from http://www.bii.a-star.edu.sg/docs/education/lsm5192_04/Multiple%20Sequence%20Alignment%20Progressive%20Approaches.pdf. ]
Calculate multiple sequence alignment using the star approach.
$$\begin{aligned} s_1 &= \mathtt{CCTGCTGCAG} \\ s_2 &= \mathtt{GATGTGCCG} \\ s_3 &= \mathtt{GATGTGCAG} \\ s_4 &= \mathtt{CCGCTAGCAG} \\ s_5 &= \mathtt{CCTGTAGG} \end{aligned}$$ Match is for $+1$, mismatches and indels for $-1$.
The following code may help you to calculate the pairwise alignments.
R
source("https://bioconductor.org/biocLite.R") biocLite("Biostrings") library(Biostrings) s1 <- "CCTGCTGCAG" s2 <- "GATGTGCCG" s3 <- "GATGTGCAG" s4 <- "CCGCTAGCAG" s5 <- "CCTGTAGG" submatrix <- nucleotideSubstitutionMatrix(match = 1, mismatch = -1, baseOnly = TRUE) pairwiseAlignment(s1, s2, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s1, s3, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s1, s4, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s1, s5, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s2, s3, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s2, s4, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s2, s5, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s3, s4, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s3, s5, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s4, s5, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE)
Align group $$\begin{aligned} s_1 &= \mathtt{ATTGCCATT\_\_} \\ s_2 &= \mathtt{ATC\_CAATTTT} \end{aligned}$$ with group $$\begin{aligned} s_1 &= \mathtt{ATGGCCATT} \\ s_2 &= \mathtt{ATCTTC\_TT} \end{aligned}$$ using the approach of CLUSTALW algorithm. Align groups based on two most similar sequences considering matches for $+1$ and mismatches and gaps for $-1$.
[Source http://www.bii.a-star.edu.sg/docs/education/lsm5192_04/Multiple%20Sequence%20Alignment%20Progressive%20Approaches.pdf. ]
The following code may help you to decide which two sequences will guide the alignment.
source("https://bioconductor.org/biocLite.R") biocLite("Biostrings") library(Biostrings) s1 <- "ATTGCCATT" s2 <- "ATCCAATTTT" s3 <- "ATGGCCATT" s4 <- "ATCTTCTT" submatrix <- nucleotideSubstitutionMatrix(match = 1, mismatch = -1, baseOnly = TRUE) pairwiseAlignment(s1, s3, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s1, s4, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s2, s3, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE) pairwiseAlignment(s2, s4, substitutionMatrix = submatrix, gapOpening = 0,gapExtension = -1, scoreOnly = FALSE)
Use BLAST algorithm to find the local alignment of query sequence $$ \mathtt{IHNWALN} $$ in database $$ \mathtt{AFGIAAAHDWALNW}. $$ Use $k=3$, a threshold for high scoring words $T=20$, and BLOSUM 62 scoring matrix.
On the second tutorial, we assembled a sequence. If you did not obtain results in reasonable time, you could use this file. Use NCBI BLAST page to find what species it is. You do not need to search for the whole sequence; one contig is enough.