Identifying the protein coding regions in DNA using IIR antinotch filter
Abstract
Genomic Signal Processing is an emerging interdisciplinary area. This work addresses the problem of identifying protein coding regions in DNA using signal processing techniques. DNA can be thought of as a string formed from the alphabet set A = {A, C, G, T}. In protein coding regions the symbols exhibit a periodicity of 3 [Trifonov and Sussman, 1980], known as the period-3 property. Genomic signal processing uses this periodicity as a cue to identify protein coding regions with techniques such as the Discrete Fourier Transform (DFT) and digital filtering. These techniques can be applied only after the symbol sequence has been mapped to numbers. This thesis shows that the computational complexity of the filters used to identify the protein coding regions present in DNA is directly related to the choice of mapping. A new lower-dimensional mapping is also proposed, which halves the computational complexity while producing results nearly equal to those of a higher-dimensional mapping.
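The period-3 cue described above can be illustrated with a minimal sketch. It assumes the commonly used four-dimensional binary indicator mapping (one 0/1 sequence per base) and evaluates the DFT at the frequency 2π/3. This is not the lower-dimensional mapping proposed in the thesis, which the abstract does not specify, and the function names `indicator_sequences` and `period3_power` are hypothetical.

```python
import cmath

def indicator_sequences(dna):
    """Map a DNA string to four binary indicator sequences, one per base.

    Assumption: this is the standard binary indicator mapping, used here
    only for illustration; it is not the mapping proposed in the thesis.
    """
    dna = dna.upper()
    return {base: [1 if s == base else 0 for s in dna] for base in "ACGT"}

def period3_power(dna):
    """Sum over the four bases of |X_b|^2 at the frequency 2*pi/3.

    X_b is the DFT of the indicator sequence of base b, evaluated at the
    single frequency that corresponds to a period of 3 symbols.
    """
    # The three distinct values of exp(-j*2*pi*n/3), indexed by n mod 3.
    twiddle = [cmath.exp(-2j * cmath.pi * r / 3) for r in range(3)]
    total = 0.0
    for seq in indicator_sequences(dna).values():
        x = sum(v * twiddle[n % 3] for n, v in enumerate(seq))
        total += abs(x) ** 2
    return total

if __name__ == "__main__":
    print(period3_power("ATG" * 30))  # strongly period-3 sequence
    print(period3_power("A" * 90))    # no period-3 component
```

In this sketch each mapped sequence needs its own transform, so the work grows with the number of sequences the mapping produces, which is consistent with the abstract's point that complexity depends on the choice of mapping. In practice the measure would typically be computed over a sliding window along the sequence so that peaks can be localised.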
Collections
- M Tech Dissertations [923]