Abstract
Time-aligned, labeled speech at the sub-word level is required to develop spoken language technology components. Determining the time boundaries of sub-word units of speech and labeling them is the speech segmentation problem. Manual labeling by humans is considered the most accurate approach, but it requires a significant amount of time when a large amount of speech has to be processed. The evidence that human labelers rely on comes from knowledge of acoustic phonetics and, at the most basic level, from spectrogram-based techniques. The hypothesis that computers can also segment speech automatically if they use the same evidence as human experts leads us toward time-efficient automatic speech segmentation. In this thesis, unsupervised automatic time-alignment of speech at the sub-word level is carried out using the information that spectrograms carry. The speech spectrogram engineered in this thesis carries no information about vocal excitation and captures only the dynamics of the vocal tract. This novel feature, which uses both the forward and inverse characteristics of the vocal tract (FICV), is found to be suitable for the segmentation problem. In addition, a framework has been developed to evaluate the suitability of a feature extraction technique for the speech segmentation task. Speech segmentation is carried out on an indigenously developed Classical Arabic (CA) dataset, which makes this the first scheme of its kind for CA, an under-resourced language in speech technology. The FICV-based speech segmentation scheme is compared with standard unsupervised and supervised techniques and shown to perform significantly better than both in terms of error rates and alignment accuracies. A 12.29% reduction in error rate is achieved with the FICV-based feature compared with the standard unsupervised technique. Supervised segmentation requires a basic sub-word-level recognizer that labels and aligns speech.
For this purpose, a Hidden Markov Model (HMM) based speech recognizer is trained. Acoustic modeling is carried out using a discriminative technique, which improves recognition accuracy by up to 4% over the non-discriminative technique. The thesis also verifies that using manually labeled data to train the acoustic models can further improve recognition accuracy by 3-4%. The thesis documents the experimental steps in detail, so it can also serve as a guideline for developing an automatic speech recognizer for CA.
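The abstract reports alignment accuracies without stating how they are computed. One common convention counts a reference boundary as correctly aligned when some hypothesized boundary falls within a fixed tolerance of it. The sketch below assumes that convention. The function name, the 20 ms tolerance, and the sample boundary times are illustrative assumptions, not values taken from the thesis.

```python
from bisect import bisect_left


def alignment_accuracy(reference_ms, hypothesis_ms, tolerance_ms=20):
    """Fraction of reference boundaries matched within +/- tolerance_ms.

    Boundaries are times in integer milliseconds. Each hypothesized
    boundary may match at most one reference boundary (greedy, in time
    order). This is an assumed metric, not the thesis's own definition.
    """
    hyp = sorted(hypothesis_ms)
    used = [False] * len(hyp)
    hits = 0
    for ref in sorted(reference_ms):
        # First hypothesized boundary at or after the window start.
        i = bisect_left(hyp, ref - tolerance_ms)
        # Scan candidates that still lie inside the tolerance window.
        while i < len(hyp) and hyp[i] <= ref + tolerance_ms:
            if not used[i]:
                used[i] = True
                hits += 1
                break
            i += 1
    return hits / len(reference_ms) if reference_ms else 0.0


# Hypothetical manual (reference) and automatic (hypothesis) boundaries.
reference = [100, 250, 400, 620]
hypothesis = [110, 240, 450, 615, 900]
print(alignment_accuracy(reference, hypothesis))  # 0.75
```

Here three of the four reference boundaries have a hypothesized boundary within 20 ms. The boundary at 400 ms is missed, and the hypothesized boundaries at 450 ms and 900 ms are insertions that this particular measure does not penalize.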