Hadoop-Based Algorithm for Inference of Phylogenetic Trees Using Maximum Likelihood Method

Thesis Info

Access Option

External Link

Author

Sardaraz, Muhammad

Program

PhD

Institute

Iqra University

City

Islamabad

Province

Islamabad

Country

Pakistan

Thesis Completing Year

2016

Thesis Completion Status

Completed

Subject

Applied Sciences

Language

English

Link

http://prr.hec.gov.pk/jspui/bitstream/123456789/2827/1/Muhammad_Sardaraz_Computer_Science_2016_SRIqra_univ_28.10.2016.pdf

Added

2021-02-17 19:49:13

Modified

2024-03-24 20:25:49

ARI ID

1676726355406

Biological sequences consist of the nucleotides A, C, G and T in DNA and carry vital information about living organisms. This information is used in many applications, such as drug design, microarray analysis and phylogenetic tree construction. Advances in computing technologies, specifically Next Generation Sequencing technologies, have increased the volume of genomic data at a rapid rate. This growth presents significant research challenges in bioinformatics, such as sequence alignment, short-read error correction and phylogenetic inference. Various tools and algorithms have been proposed for phylogenetic inference. Early algorithms used sequential programs. Improvements were gained in tree accuracy and execution time; however, the programs were still slow, and further improvements were needed to infer correct phylogenies in a short time. This challenge brought parallel and distributed processing into bioinformatics, and many tools and programs have since been developed on that basis. This thesis presents algorithmic solutions for phylogenetic inference: the ‘PhyloDoop’ and ‘SeqCompress’ algorithms. PhyloDoop infers phylogenetic trees. It is based on the Maximum Likelihood method and implemented on the Hadoop Map/Reduce framework. PhyloDoop is cluster-based: it divides the input alignment into clusters, builds a tree for each cluster, merges and optimizes the sub-trees, and then optimizes the final tree. PhyloDoop is compared with well-known algorithms on both real and simulated datasets. Experiments on real datasets tested likelihood values, execution time, and speedup in a distributed environment. The results show better accuracy than other algorithms on most of the datasets. Execution time is also shorter on most datasets, and the proposed algorithm yields better speedup on large datasets.
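The cluster-based flow described above (divide the alignment, build a tree per cluster, merge the sub-trees) can be illustrated with a small single-process sketch. This is not PhyloDoop's implementation: the seed-based clustering rule, the function names, and the ladder-shaped placeholder tree builder are all assumptions made for illustration, and the Maximum Likelihood search and optimization steps are omitted entirely. It only shows the map, shuffle and reduce shape of the data flow, with trees written as Newick strings.

```python
from collections import defaultdict


def hamming(a, b):
    """Count mismatched columns between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def map_phase(alignment, seeds):
    """Map step: emit (cluster_id, (name, seq)) for every sequence.

    Assumption: a sequence joins the cluster of its nearest seed.
    """
    for name, seq in alignment.items():
        cid = min(range(len(seeds)), key=lambda i: hamming(seq, seeds[i]))
        yield cid, (name, seq)


def shuffle(pairs):
    """Group mapped values by key, as a Map/Reduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(members):
    """Reduce step: build one sub-tree per cluster.

    Placeholder only: joins names into a ladder-shaped Newick tree.
    A real system would run a likelihood-based tree search here.
    """
    names = sorted(name for name, _ in members)
    tree = names[0]
    for name in names[1:]:
        tree = f"({tree},{name})"
    return tree


def merge(subtrees):
    """Join the per-cluster sub-trees under a single root."""
    trees = [subtrees[cid] for cid in sorted(subtrees)]
    if len(trees) == 1:
        return trees[0] + ";"
    return "(" + ",".join(trees) + ");"


def infer(alignment, seeds):
    """Run map, shuffle, reduce and merge in sequence."""
    groups = shuffle(map_phase(alignment, seeds))
    subtrees = {cid: reduce_phase(m) for cid, m in groups.items()}
    return merge(subtrees)


if __name__ == "__main__":
    toy = {"s1": "AAAA", "s2": "AAAT", "s3": "GGGG", "s4": "GGGC"}
    print(infer(toy, seeds=["AAAA", "GGGG"]))
```

On a Hadoop cluster the map and reduce functions would run on different nodes, which is where the speedup on large datasets would come from; here they run in one process so the flow is easy to follow.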
Simulated datasets were used to measure topological accuracy. PhyloDoop is topologically accurate on most datasets, with shorter execution time than other algorithms. SeqCompress compresses DNA sequences in order to reduce memory requirements and execution time, and it compares favorably with other algorithms. These results point to an opportunity for using compression techniques to infer correct phylogenies with low memory requirements and short execution time.
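The abstract does not describe how SeqCompress works. As a generic illustration of why DNA compresses well, and not as a description of SeqCompress, the sketch below packs the four-letter alphabet into 2 bits per base, so four bases fit in one byte. The function names and the choice to store the sequence length alongside the packed bytes are assumptions of this example; it handles only the letters A, C, G and T.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack a DNA string into bytes, 2 bits per base.

    Returns (packed_bytes, original_length); the length is needed
    because the last byte may be only partly filled.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so unpacking reads it first.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out), len(seq)


def unpack(data, length):
    """Recover the DNA string from packed bytes and its length."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


if __name__ == "__main__":
    packed, n = pack("ACGTTGCAAC")
    print(len(packed), unpack(packed, n))
```

Ten bases stored as text take ten bytes; packed this way they take three, which is the kind of memory saving that compression is meant to bring to phylogenetic inference.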

