Imputation of Missing Values in the In/Out Procedure of Random Forest

Thesis Info

Access Option: External Link
Author: Amjad Ali
Program: PhD
Institute: University of Peshawar
City: Peshawar
Province: KPK
Country: Pakistan
Thesis Completing Year: 2019
Thesis Completion Status: Completed
Subject: Statistics
Language: English
Link: http://prr.hec.gov.pk/jspui/bitstream/123456789/10235/1/Amjad%20Ali_UoP_2019.pdf
Added: 2021-02-17 19:49:13
Modified: 2024-03-24 20:25:49
ARI ID: 1676726440231

The performance of a classifier can be affected to a great extent by the presence of missing values in a dataset. Several methods have been proposed in the literature to treat missing data, and the one used most frequently is deleting instances that contain at least one missing feature value. In this part of the study we compare three methods for dealing with missing values, to evaluate their effect on the misclassification error rate of a non-parametric classifier: case deletion, simple random imputation, and a modified random imputation procedure. The classifiers considered were the conventional random forest and the In/Out procedure of the random forest. The missing data problem is common and often unavoidable, especially when dealing with large data sets from several real-world sources. Many new computational tools have been developed to tackle missing data problems. In some cases, the preferred approaches involve temporarily removing missing data or substituting surrogate values for them. Existing methods have been applied successfully to well-defined parametric models; however, their usefulness has yet to be established for tree-based models. Missing values, out-of-bag error and misclassification rates in imbalanced data are difficult to deal with in the Random Forest technique. In this study, a new imputation method is proposed for the In/Out procedure of Random Forest. The proposed method does not depend on the missing data mechanism, which is its principal advantage. This rectifies the disadvantages of other imputation methods. Its performance has been evaluated and compared with results on data sets without missing values. It is concluded that the newly proposed method reduced the Out-Of-Bag error in the presence of missing values across different Random Forest procedures.
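As a rough illustration of two of the baseline treatments named in the abstract, the sketch below contrasts case deletion with simple random imputation on a tiny made-up table. It assumes the usual textbook reading of simple random imputation (each missing value is replaced by a random draw from the observed values of the same feature). The function names, the seed and the toy data are hypothetical, and the thesis's modified random imputation and In/Out procedure are not reproduced here, since the abstract does not specify them. No random forest is fitted.

```python
import random


def case_deletion(rows):
    """Drop every row that has at least one missing (None) feature value."""
    return [row for row in rows if all(v is not None for v in row)]


def simple_random_imputation(rows, seed=0):
    """Replace each missing value with a random draw from the observed
    values of the same column (assumed textbook definition).
    Raises IndexError if a column has no observed values at all."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    n_cols = len(rows[0])
    # Observed (non-missing) values, collected column by column.
    observed = [
        [row[j] for row in rows if row[j] is not None]
        for j in range(n_cols)
    ]
    return [
        [v if v is not None else rng.choice(observed[j])
         for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy data: three numeric features, None marks a missing value.
rows = [
    [5.1, 3.5, 1.4],
    [4.9, None, 1.4],
    [None, 3.2, 1.3],
    [6.3, 3.3, 6.0],
    [5.8, 2.7, None],
]

deleted = case_deletion(rows)
imputed = simple_random_imputation(rows)
print(len(deleted), len(imputed))  # prints "2 5"
```

Case deletion keeps only the two complete rows, while random imputation keeps all five, which is the trade-off that motivates comparing the methods by error rate.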
Similar Article Headings

When Owais's Tooth Broke

When Owais's tooth broke,
he found favour in the Presence.
They keep their distance from the lover;
they do him no other harm.
They stay far away from love,
those people who are wise.
The beautiful fall for the beautiful
and leave us breathless.
Compassion, service and care:
this is the way of the virtuous.
Companions in sorrow and pain
are few people in this world.
Hanif fears God; who knows
which colour pleases Him.

Comparative Analysis and Best Practices in Islamic Education in Modern Islamic World: Lessons and Insights for Reforms in Pakistani Madrasah System

It is important to know how the whole system of Madrasah education evolved and which trends contributed to shaping religious education, particularly in the Muslim world. In this article I present a comparative analysis of some important religious education systems prevalent in prominent Muslim countries, to demonstrate how the same institutions can develop along different lines when states apply different approaches. I compare five Muslim countries, namely Egypt, Saudi Arabia, Turkey, Indonesia and Bangladesh. Through this comparison I try to illustrate briefly how different contexts shape and direct the overall approach, methodology and pedagogical methods of these institutions. And I will try Here I will make comparison of four major religious education institutions working in four different Muslim countries namely

Statistical Analysis of Paired Comparison Models Through Bayesian Approach

Bayesian statistics provides a theory of inference that enables us to relate observed results to theoretical predictions, and it provides the only generic tool for incorporating new experimental evidence and updating existing information. In most practical situations in Statistics, we have to deal with comparisons. One such technique is paired comparison. The method of paired comparisons has been widely employed to remove some of the difficulties involved in the simultaneous comparison of several objects. It is used in experimentation and research methodologies in which subjective judgment is involved, and it has therefore attracted the attention of many Bayesian analysts. In recent years, many models for paired comparisons have been devised. The present study contributes to the theory of Bayesian Statistics by presenting Bayesian analysis for four different paired comparison models: the Davidson model with order effect, the Rao-Kupper model with order effect, the van Baaren model VI and the amended Davidson model. For the analysis, both noninformative and informative priors are used. The joint and marginal posterior distributions of the model parameters are derived, and the posterior estimates (means and modes) of the parameters, the predictive probabilities for a future single paired comparison and the posterior probabilities for comparing two parameters are calculated. The use of the Gibbs sampling procedure is also given in this study. The analysis has been performed for three and four treatments. An interesting amendment has been made to the Davidson model to accommodate a no-preference category, both for respondents who genuinely have no preference and for those who have not been able to distinguish between the two treatments/objects. We give the Bayesian analysis of the amended model using both the noninformative and the informative priors.
For the informative prior, the hyperparameters are elicited through the prior predictive distribution. The values chosen are those that minimize the difference between the confidence levels characterized by the hyperparameters in the prior predictive distribution and the confidence levels elicited from the expert. The entire calculation of the posterior estimates, the predictive and preference probabilities, and the marginal distributions along with their graphical presentations, as well as the posterior probabilities for testing hypotheses that compare parameters, is carried out mainly in the SAS package. As a novel part of our work, an assessment is also included that compares the posterior estimates, the predictive probabilities for a future single paired comparison and the posterior probabilities of hypotheses for comparing parameters of the said three models. A small data set is also considered for the analysis of the models. Finally, some ideas for future research have been proposed, and appendices carrying some important programs written in the SAS and Mathematica packages have been added.