Alignment
“It is the comparison of two or more DNA or protein sequences by
searching for a series of individual characters or character patterns.”
1) Pairwise sequence alignment:-
a) Local Alignment:-
· The alignment stops at the ends of regions of strong similarity, and a much higher priority is given to finding these local regions.
· It considers only the matching regions.
· Dashes in a sequence indicate parts of the sequence that are not included in the alignment.
· Vertical bars indicate identical residues in the two sequences.
· The short, high-scoring match from which a search tool such as BLAST extends the alignment is called the seed.
· The Smith-Waterman algorithm is used for this alignment.
e.g. BLAST (Basic Local Alignment Search Tool).
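The points above can be sketched in a few lines of Python. This is only a minimal illustration of the Smith-Waterman idea, not how BLAST itself works; the function name and the scoring values (match=2, mismatch=-1, gap=-2) are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    The scoring values are illustrative assumptions, not fixed by the algorithm.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1] and b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            diag = H[i - 1][j - 1] + step   # align a[i-1] with b[j-1]
            up = H[i - 1][j] + gap          # a[i-1] against a gap
            left = H[i][j - 1] + gap        # b[j-1] against a gap
            # The floor of 0 lets the alignment start afresh anywhere,
            # so only the strongly similar region contributes to the score.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


# The shared region CTAGG (5 matches x 2) gives the best local score.
print(smith_waterman_score("AGCTAGGA", "CACTAGGC"))  # 10
```

Because every cell is floored at zero, poorly matching ends of the sequences cannot pull the score down, which is why the alignment stops at the ends of the similar region.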
b) Global Alignment:-
· In a global alignment, an attempt is made to align the entire sequence, using all sequence characters up to both ends of each sequence.
· It considers matches, mismatches and gap regions as well.
· Sequences that are quite similar and approximately the same length are suitable candidates for global alignment.
· The Needleman-Wunsch algorithm is used for this alignment.
e.g. the GGSEARCH program of the FASTA package (the FASTA search program itself reports local alignments).
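To show how matches, mismatches and gaps all enter a global alignment, here is a small sketch that scores an alignment that has already been written out. The function name, the example rows and the scoring values (match=1, mismatch=-1, gap=-2) are assumptions made for illustration only.

```python
def score_global_alignment(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Score two aligned rows of equal length; '-' marks a gap.

    Returns (score, counts). The scoring values are illustrative assumptions.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot contain two gaps")
        if x == "-" or y == "-":
            counts["gap"] += 1        # one residue against a gap
        elif x == y:
            counts["match"] += 1      # identical residues
        else:
            counts["mismatch"] += 1   # different residues
    score = (counts["match"] * match
             + counts["mismatch"] * mismatch
             + counts["gap"] * gap)
    return score, counts


# Six matches and one gap: 6 * 1 + 1 * (-2) = 4
print(score_global_alignment("GATTACA", "GAT-ACA"))
```

Every column from the first character to the last is counted, which is the difference from a local alignment, where the unaligned ends are simply ignored.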
Principal methods for pairwise sequence alignment:-
a) Dot Matrix
b) Dynamic Programming algorithm
a) Dot Matrix:
This method displays any possible sequence alignment as a diagonal on the matrix. Dot matrix analysis can readily reveal the presence of insertions, deletions, and direct and inverted repeats that are more difficult to find by the other methods. This method was first described by Gibbs and McIntyre (1970).
Features of Dot matrix:
The dot matrix should be visible on a computer terminal, thus providing an interactive environment so that different types of analysis can be tried.
Use of coloured dots can enhance the detection of regions of similarity.
Method of Dot Matrix:-
1) In the dot matrix method of sequence comparison, one sequence A is listed across the top of the page and the other sequence B is listed down the side.
2) Starting with the first character in B, the comparison moves across the page in the first row and places a dot in any column where the character in A is the same.
3) The second character in B is then compared to the entire A sequence, and a dot is placed in row two wherever a match occurs.
4) This process is continued until the page is filled with dots representing all the possible matches of A characters with B characters.
5) Detection of matching regions may be improved by filtering out random matches in the dot matrix. Filtering uses a sliding window: a dot is placed only when enough characters within the window match.
6) A larger window size is generally used for DNA sequences than for protein sequences, because the number of random matches expected between unrelated sequences is much greater with only 4 DNA symbols than with 20 amino acid symbols.
Sequence A: AGCTAGGA
Sequence B: CACTAGGC
7) To maximize the number of matches, the resulting alignment could be:-
-AGCTAGGA-
CA-CTAGG-C
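The dot matrix steps can be tried on the two example sequences with a short Python sketch. The function name and the parameters `window` and `min_matches` are assumptions for this example; here the window is taken to start at each cell and run down the diagonal, which is one simple way to filter random matches.

```python
def dot_matrix(seq_a, seq_b, window=1, min_matches=1):
    """Return one text row per character of seq_b ('.' = dot, ' ' = no dot).

    seq_a runs across the top and seq_b down the side. With window=1 and
    min_matches=1 a dot is placed wherever the two characters are the same.
    """
    rows = []
    for i in range(len(seq_b)):
        row = []
        for j in range(len(seq_a)):
            # Count matches along the diagonal window starting at (i, j);
            # the window is cut short at the edges of the matrix.
            span = min(window, len(seq_b) - i, len(seq_a) - j)
            hits = sum(seq_b[i + k] == seq_a[j + k] for k in range(span))
            row.append("." if hits >= min_matches else " ")
        rows.append("".join(row))
    return rows


seq_a = "AGCTAGGA"
seq_b = "CACTAGGC"

print("  " + seq_a)
for ch, row in zip(seq_b, dot_matrix(seq_a, seq_b)):
    print(ch + " " + row)

# Filtered version: keep a dot only where 3 characters in a row match.
print()
print("  " + seq_a)
for ch, row in zip(seq_b, dot_matrix(seq_a, seq_b, window=3, min_matches=3)):
    print(ch + " " + row)
```

In the unfiltered plot the shared region CTAGG shows up as a run of dots on the diagonal among scattered single matches; with the window of 3, only dots belonging to that diagonal run remain.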
b) Dynamic Programming algorithm:-
1) It is a method of sequence alignment that can take gaps into account but requires only a manageable number of comparisons.
2) Dynamic programming is an efficient recursive method to search through all the possible alignments and find the one with an optimal score.
3) Dynamic programming usually consists of the following three components.
i) Recursion relation
ii) Tabular computation
iii) Traceback
4) Needleman and Wunsch first introduced a dynamic programming algorithm for comparing two sequences in 1970.
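The three components can be seen together in a minimal Python sketch of a Needleman-Wunsch style global alignment. This is an illustration under assumed scoring values (match=1, mismatch=-1, gap=-1) and a simple linear gap penalty, not the original 1970 formulation; the function name is also an assumption.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment.

    The scoring values are illustrative assumptions.
    """
    n, m = len(a), len(b)

    # Tabular computation: F[i][j] = best score for aligning a[:i] with b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            # Recursion relation: best of diagonal, gap in b, gap in a.
            F[i][j] = max(F[i - 1][j - 1] + step,
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)

    # Traceback: walk from the bottom-right cell back to the top-left,
    # following a move that could have produced each cell's score.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if F[i][j] == F[i - 1][j - 1] + step:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = needleman_wunsch("AGCTAGGA", "CACTAGGC")
print(score)
print(top)
print(bottom)
```

The table has (n+1) x (m+1) cells and each cell is filled from three neighbours, so the number of comparisons stays manageable even though the number of possible gapped alignments is enormous. When several alignments share the optimal score, the traceback returns only one of them.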
--Shweta Adsule


