Sample donated: Marlene Lowe
Last updated: August 28, 2019
Abstract
The explosion of data generated in bioinformatics and allied fields poses real challenges to researchers. Mining useful biological information from this data and analysing it has become an important task, since it gives insight into the underlying functionalities and cellular processes. The highly complex nature of biological data demands the development of good-quality pattern recognition algorithms to ensure reliable output. Applications of pattern recognition can be seen in almost all areas of bioinformatics, such as genomics, proteomics and transcriptomics. Classification and clustering are applied to large datasets in areas like gene expression analysis, pathway analysis, gene regulatory networks, disease modelling and identification of biomarkers. Advances in machine learning techniques, algorithms and tools help researchers solve various pattern recognition problems effectively. In this review, we discuss a number of such problems in the field of bioinformatics and computational biology, along with the various techniques used.

Key words
Pattern recognition, computational biology, classification, Artificial Neural Network, clustering and deep learning

Introduction
Pattern recognition is one of the key strategies by which the brain performs analogical reasoning about many real-life problems, based on the information accumulated through the sense organs.
In general terms, pattern recognition involves receiving input data and analysing it for similar, specific or regular patterns, on the basis of which a meaningful interpretation is made. Pattern recognition can be employed to make computers execute tasks as humans do, only faster and more accurately, by identifying the actual problem and using a collection of mathematical, statistical, heuristic and inductive techniques to find solutions. When a computer program is trained to learn a pattern and categorize data, the process is called machine learning or machine pattern recognition. Solutions based on pattern recognition may be employed in almost every domain: medicine, the health and pharmaceutical industries, agriculture, financial markets and forensic investigation. During the last few decades, an enormous amount of biological data in different formats has been generated using advanced technologies. Moreover, researchers have developed a number of databases that accumulate huge volumes of molecular data. Consequently, the demand for new computational techniques to process this data better has also increased.
Mining useful information and interpreting it biologically has become one of the most important problems in bioinformatics. Various processes in nature have been analysed, and many nature-inspired algorithms have been derived from them for computational pattern recognition. Studies reveal that biomolecules (DNA, RNA, proteins), in both sequence and structural form, contain different patterns that are functionally relevant. These patterns, also known as motifs, play a major role in the characterization of these biomolecules. In proteins, patterns may also occur among the elements of secondary structure.
Helix-turn-helix is a widely studied motif that falls in the category of DNA-binding motifs. Detection techniques for such patterns make use of structural features as well [1]. Recent studies in drug discovery show that proline-rich linear motifs are excellent mediators of the intermolecular interactions seen in many facets of immune response activity, and hence these motifs are considered drug targets in immune-mediated diseases. Alignment methods, local search and heuristic approaches are a few of the techniques applied to this pattern identification task. Within medical science, pattern recognition is the basis for computer-aided diagnosis (CAD) systems. CAD describes a procedure that supports the doctor's interpretations and findings. Detection of patterns demands computational techniques that produce optimum results.
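As a small illustration of the motif identification task described above, the following sketch scans a protein sequence for a short linear motif using a regular expression. It is only a minimal example under stated assumptions: the PxxP core (proline, any two residues, proline) is used as a simple stand-in for a proline-rich linear motif, the function name `find_motif` is hypothetical, and the sequence is a made-up toy string rather than a real protein. Practical motif detection would typically rely on alignment-based or heuristic methods and on structural features.

```python
import re

# Assumption: the PxxP core is used as a simple stand-in for a
# proline-rich linear motif. The lookahead allows overlapping hits.
PXXP = re.compile(r"(?=(P..P))")

def find_motif(sequence, pattern=PXXP):
    """Return (1-based start position, matched text) for every occurrence,
    including overlapping ones. Hypothetical helper for illustration."""
    return [(m.start() + 1, m.group(1))
            for m in pattern.finditer(sequence.upper())]

# Made-up toy sequence, not a real protein.
toy_sequence = "MKAPPLPPRSTPAAPG"
for position, match in find_motif(toy_sequence):
    print(position, match)
```

A plain pattern scan like this only finds exact matches to a fixed template; it does not score partial matches or use structural context, which is why the more elaborate techniques mentioned above are needed in practice.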