Assignment 2: SEQUENCE ALIGNMENT

Please do not solve this assignment before it has been officially presented in class!
If you think that this homework is not for you (for example, you have already implemented Levenshtein distance several times), you may submit BLAST ALIGNMENT instead.

Needleman-Wunsch and Smith-Waterman algorithms

Implement the pairwise global alignment algorithm (Needleman-Wunsch) and the pairwise local alignment algorithm (Smith-Waterman) in a programming language of your choice (e.g., Python, Perl, Java). The required arguments are two input sequences in FASTA format, a scoring matrix in CSV format (e.g., a blosum62 file), and a gap penalization.
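
For orientation, the two algorithms are usually stated as the recurrences below. This is a sketch under assumptions that the assignment text does not spell out: $a$ and $b$ are the two sequences, $S$ is the scoring matrix, and the gap penalization $p$ is given as a non-negative number that is subtracted for every gap position (a linear gap model, without the optional gap extension penalization).

Global alignment (Needleman-Wunsch), with $F(0,0) = 0$, $F(i,0) = -i \cdot p$ and $F(0,j) = -j \cdot p$:

$$
F(i,j) = \max \begin{cases}
F(i-1,j-1) + S(a_i, b_j) \\
F(i-1,j) - p \\
F(i,j-1) - p
\end{cases}
$$

Local alignment (Smith-Waterman), with $H(i,0) = H(0,j) = 0$:

$$
H(i,j) = \max \begin{cases}
0 \\
H(i-1,j-1) + S(a_i, b_j) \\
H(i-1,j) - p \\
H(i,j-1) - p
\end{cases}
$$

Under these assumptions the global score is $F(|a|,|b|)$, and the local score is the maximum of $H(i,j)$ over all cells.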

The program alignment.sh should accept the following arguments:

argument   meaning
-g/-l      global/local pairwise alignment
-s1        path to the first sequence
-s2        path to the second sequence
-e         path to the score matrix in CSV format
-p         gap penalization
-pe        gap extension penalization (optional)
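
One possible way to accept these arguments, assuming the solution is written in Python, is sketched below with the standard `argparse` module. The function name `build_parser`, the destination names (`mode`, `seq1`, `seq2`, `matrix`, `gap`, `gap_extend`) and the choice of integer penalties are illustrative assumptions, not part of the assignment.

```python
import argparse


def build_parser():
    """Build a parser for the alignment.sh arguments (hypothetical helper)."""
    # allow_abbrev=False keeps "-p" and "-pe" strictly separate options.
    parser = argparse.ArgumentParser(prog="alignment.sh", allow_abbrev=False)

    # Exactly one of -g / -l must be given.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-g", dest="mode", action="store_const", const="global",
                      help="global pairwise alignment")
    mode.add_argument("-l", dest="mode", action="store_const", const="local",
                      help="local pairwise alignment")

    parser.add_argument("-s1", dest="seq1", required=True,
                        help="path to the first sequence")
    parser.add_argument("-s2", dest="seq2", required=True,
                        help="path to the second sequence")
    parser.add_argument("-e", dest="matrix", required=True,
                        help="path to the score matrix in CSV format")
    # Assumption: penalties are integers, as in the examples (-p 4).
    parser.add_argument("-p", dest="gap", type=int, required=True,
                        help="gap penalization")
    parser.add_argument("-pe", dest="gap_extend", type=int, default=None,
                        help="gap extension penalization (optional)")
    return parser
```

In such a setup, alignment.sh could be a thin wrapper that forwards its arguments to the Python script, which then calls `build_parser().parse_args()`.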

Input and output

./alignment.sh -g -s1 A0PQ23.fasta -s2 Q9CD83.fasta -e blosum62.csv -p 4

Runs the Needleman-Wunsch algorithm on the two input sequences A0PQ23.fasta and Q9CD83.fasta, with the score matrix defined in the blosum62.csv file and the gap penalization equal to $4$.

./alignment.sh -l -s1 EU078679.fasta -s2 CH954156.fasta -e blastmatrix.csv -p 4

Runs the Smith-Waterman algorithm on the two input sequences EU078679.fasta and CH954156.fasta, with the score matrix defined in the blastmatrix.csv file and the gap penalization equal to $4$.
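
Reading the two kinds of input files might look like the following sketch. It rests on assumptions the assignment does not state: only the first record of each FASTA file is used, and the CSV matrix has the column symbols in its first row and the row symbol in the first field of every following row, with integer scores. The function names `read_fasta` and `read_score_matrix` are hypothetical.

```python
import csv


def read_fasta(lines):
    """Return (header, sequence) of the first record in an iterable of lines."""
    header, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; only the first is used
            header = line[1:]
        elif header is not None:
            parts.append(line)
    if header is None:
        raise ValueError("no FASTA header line found")
    return header, "".join(parts).upper()


def read_score_matrix(lines):
    """Parse the CSV into a nested dict: matrix[row_symbol][column_symbol]."""
    # Assumed layout: first row = column symbols, first column = row symbols.
    rows = [row for row in csv.reader(lines) if row]
    columns = [cell.strip() for cell in rows[0][1:]]
    matrix = {}
    for row in rows[1:]:
        symbol = row[0].strip()
        matrix[symbol] = {col: int(val) for col, val in zip(columns, row[1:])}
    return matrix
```

Both functions accept any iterable of text lines, so an open file works directly, for example `with open(path, newline="") as handle: matrix = read_score_matrix(handle)`.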

The final alignment and the corresponding score are printed to STDOUT on three lines: the first aligned sequence, the second aligned sequence, and the alignment score.

The result of the second call would be:

TTGACAGTACATAG
TTGA-AGTTTGTAG
34
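
A minimal sketch of producing this three-line output is shown below. The helper name `format_alignment` is hypothetical, and the length check is an added sanity test rather than a stated requirement: two aligned sequences, gaps included, should always have the same length.

```python
def format_alignment(first, second, score):
    """Return the three output lines: both aligned sequences, then the score."""
    if len(first) != len(second):
        raise ValueError("aligned sequences must have equal length")
    return f"{first}\n{second}\n{score}"


# Reproduces the sample output of the second call.
print(format_alignment("TTGACAGTACATAG", "TTGA-AGTTTGTAG", 34))
```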

Scoring

Your program must run in quadratic time and must not allocate more than a quadratic amount of memory. Submitting code that does not compile may result in a significant point reduction.
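
To illustrate where the quadratic bounds come from, the sketch below fills a single table of (n+1) x (m+1) cells with constant work per cell and returns only the optimal score. It is not a complete solution: the traceback that recovers the aligned sequences is omitted, the gap model is linear (no gap extension penalization), and the gap penalization is assumed to be a non-negative number that is subtracted. The function name `alignment_score` and the nested-dict form of the scoring matrix are illustrative assumptions.

```python
def alignment_score(a, b, score, gap, local=False):
    """Return the optimal global (default) or local alignment score of a and b."""
    n, m = len(a), len(b)
    # One table of (n+1) * (m+1) cells: quadratic memory.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    if not local:
        # Global alignment: leading gaps are penalized in the first row/column.
        for i in range(1, n + 1):
            table[i][0] = -i * gap
        for j in range(1, m + 1):
            table[0][j] = -j * gap
    best = 0
    # Two nested loops with constant work per cell: quadratic time.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cell = max(
                table[i - 1][j - 1] + score[a[i - 1]][b[j - 1]],  # (mis)match
                table[i - 1][j] - gap,                            # gap in b
                table[i][j - 1] - gap,                            # gap in a
            )
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
                best = max(best, cell)
            table[i][j] = cell
    return best if local else table[n][m]
```

Storing a traceback pointer per cell, or re-deriving the move from neighbouring cells during traceback, keeps the memory use within the same quadratic bound.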

External libraries

The alignment implementation must be solely your own work. However, you can use any external library of your choice for the following:

Don't forget that your code needs to compile!