Software defect prediction techniques have attracted the attention of many researchers because
they effectively reduce the cost of the testing process. Most software defect datasets contain
uncleaned, noisy, high-dimensional, and imbalanced data. These problems reduce the prediction
accuracy of a classifier. In this paper, a framework is proposed which combines approaches to
deal with all these problems. The framework comprises four stages: 1) data preprocessing, 2)
feature selection (FS), 3) class balancing, and 4) classification through ensemble learning.
Normalization is performed on the cleaned datasets. A multilayer perceptron (MLP) is used as the
subset evaluator in the FS process with six search methods: Best First (BF), Greedy Stepwise (GS),
Genetic Search (GA), Particle Swarm Optimization Search (PSO), Rank Search (RS), and Linear Forward
Selection (LFS). The Resample and SMOTE algorithms are used for class balancing. In the classification stage, a
stacking ensemble model is built on 80% of the input data. The meta classifier
is set to MLP, and the base classifiers are decision tree (J48), random forest (RF), support vector
machine (SVM), k-nearest neighbor (kNN), and Bayes Net (BN). Parameter tuning of the meta and
base classifiers is also performed. Performance is evaluated on the NASA MDP datasets using precision,
recall, F-measure, MCC, ROC, and accuracy. The results show that the proposed method
significantly improves defect prediction compared with the base classifiers.
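
As an illustration of the evaluation measures named above, the following minimal sketch computes precision, recall, F-measure, MCC, and accuracy from the four cells of a binary confusion matrix, treating the defective class as positive. The function names and the example counts are hypothetical and are not taken from the experiments. ROC is omitted because it requires ranked prediction scores rather than a single confusion matrix. Returning 0.0 for an undefined ratio is an assumption made here for convenience.

```python
import math


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero (an assumed convention)."""
    return num / den if den else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary measures from confusion-matrix counts.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives (defective = positive class).
    """
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    # F-measure is the harmonic mean of precision and recall.
    f_measure = safe_div(2 * precision * recall, precision + recall)
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    # Matthews correlation coefficient uses all four cells, so it stays
    # informative when the classes are imbalanced.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, mcc_den)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not results from the paper.
    metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

Because MCC takes all four cells into account, it tends to be a more balanced summary than accuracy on imbalanced defect datasets, which is presumably why it is reported alongside the other measures.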