Add/Update Thesis
Title*
Author's Name*
Supervisor's Name
Abstract
Speech recognition has been an active area of research for the last few decades, and a number of sophisticated methods have been developed in recent years to improve the recognition rate. A speech recognition system consists of two main components, a front-end and a back-end. In this thesis, we introduce new front-end methods that achieve a higher recognition rate. For the front-end, we propose novel spectral features for speech recognition. More specifically, this thesis replaces the traditional state-of-the-art feature extraction technique, mel frequency cepstral coefficients (MFCC), with an adaptive mel filter bank, a cognitively inspired feature extraction approach that constructs the filter bank adaptively after sensing the spectrum of the input signal. This work has not only improved the performance of automatic speech recognition (ASR) systems but has also contributed to the ASR field in three main directions. The first facet concerns improving spectrogram visualization through adaptive window size selection. The short-time Fourier transform (STFT) is a well-known technique for time-frequency analysis of non-stationary signals. Selecting an appropriate window size becomes a difficult task when no background information about the input signal is available. This work proposes a novel empirical model that selects the window size adaptively for a narrow-band signal using a spectrum sensing technique. Because a fixed model is undesirable for wide-band signals, the proposed model adopts the constant-Q transform (CQT) for them. Unlike the STFT, the CQT provides a varying time-frequency resolution. The proposed model not only improves spectrogram visualization but also reduces the computational cost. It achieves 87.71% accuracy in selecting the appropriate window length. The proposed model is useful not only for feature extraction from speech signals but equally for biomedical, music and radio signals.
The second facet relates to a commercial application of speech recognition. This thesis presents a novel idea that automatically identifies hearing impairment based on a cognitively inspired feature extraction and speech recognition approach. To the best of the authors’ knowledge, this is the first attempt to automate pure tone and speech audiometry testing based on speech recognition. The proposed method uses an adaptive filter bank with weighted mel frequency cepstral coefficients for feature extraction. Classification is performed using a well-known statistical pattern matching technique, the hidden Markov model (HMM). Performance evaluation and comparison with the ground truth (expert audiologist results) and with current state-of-the-art techniques show that the proposed method can achieve comparable results automatically. Specifically, the overall absolute error of the proposed model relative to the expert audiologist results is less than 4.9 dB for pure tone audiometry and less than 4.4 dB for speech audiometry. The overall accuracy achieved by the proposed method is 96.67%. The third facet concerns applying the proposed feature extraction model to dialect recognition for a low-resource local language. Traditional dialect recognition methods based on MFCC and the discrete wavelet transform (DWT) work well for high-resource languages, but their accuracy is lower for low-resource languages. This thesis presents a new approach to Pashto dialect recognition using an adaptive filter bank with MFCC and DWT. The approach extracts features using the adaptive filter bank in MFCC and DWT, followed by classification using statistical pattern matching (HMM) and the machine learning classifiers K-nearest neighbors (KNN) and support vector machine (SVM). The three proposed models are tested and compared with state-of-the-art techniques. The proposed method achieved an overall accuracy of 88%.
Subject/Specialization
Language
Program
Faculty/Department's Name
Institute Name
University Type
Public
Private
Campus (if any)
Institute Affiliation Information (if any)
City where institute is located
Province
Country
Degree Starting Year
Degree Completion Year
Year of Viva Voce Exam
Thesis Completion Year
Thesis Status
Completed
Incomplete
Number of Pages
Urdu Keywords
English Keywords
Link
Select Category
Religious Studies
Social Sciences & Humanities
Science
Technology
Any other information you want to share, such as Table of Contents or Conclusion.
Your email address*