Home
Add
Get on Google Play
Home
> Edit
Add/Update Thesis
Title*
Author's Name*
Supervisor's Name
Abstract
Biological sequences consist of A C G and T in a DNA structure and contain vital information of living organisms. This information is used in many applications such as drug design, microarray analysis and phylogenetic trees. Advances in computing technologies, specifically Next Generation Sequencing technologies have increased genomic data at a rapid rate. The increase in genomic data presents significant research challenges in bioinformatics, such as sequence alignment, short read error correction, phylogenetic inference etc. Various tools and algorithms have been proposed for phylogenetic inference. Early algorithms used sequential programs to solve the problem of phylogenetic inference. Improvements were gained in terms of tree accuracy and execution time, however; the programs were still slow, and improvements were needed to infer correct phylogeny in short times. This challenge introduced parallel and distributed processing to the field of bioinformatics. Many tools and programs have been developed based on parallel and distributed computing. This thesis presents algorithmic solutions for phylogenetic inference. Solutions include ‘PhyloDoop’ and ‘SeqCompress’ algorithms. PhyloDoop algorithm is used for inference of phylogenetic trees. The algorithm is based on Maximum Likelihood method, implemented on Hadoop Map/Reduce framework. PhyloDoop is based on clusters i.e. divides the input alignment to clusters, builds trees for each cluster, merges and optimizes all sub-trees and the final tree is also optimized. PhyloDoop is compared to well-known algorithms both on real and simulated datasets. Experiments on real datasets were performed to test likelihood values, execution time, and speedup in distributed environment. The results show better accuracy as compared to other algorithms on most of the datasets. Execution time is also short on most datasets. The proposed algorithm yields better speed up on large datasets. Simulated datasets were used to measure topological accuracy. PhyloDoop is topologically accurate on most datasets with short execution time in comparison to other algorithms. SeqCompress is used to compress DNA sequences in order to reduce memory requirements and execution time. Impressive results are shown in comparison to other algorithms. These results show a gap for efficient usage of compression techniques to infer correct phylogeny with low memory requirements as well as execution time.
Subject/Specialization
Language
Program
Faculty/Department's Name
Institute Name
Univeristy Type
Public
Private
Campus (if any)
Institute Affiliation Inforamtion (if any)
City where institute is located
Province
Country
Degree Starting Year
Degree Completion Year
Year of Viva Voce Exam
Thesis Completion Year
Thesis Status
Completed
Incomplete
Number of Pages
Urdu Keywords
English Keywords
Link
Select Category
Religious Studies
Social Sciences & Humanities
Science
Technology
Any other inforamtion you want to share such as Table of Contents, Conclusion.
Your email address*