Promise repository datasets for defect prediction

4/15/2023

One of the most difficult and time-consuming aspects of machine learning is gathering high-quality data that can be used to train an algorithm. Historical data is a gold mine for predicting future outcomes in a given field with a high level of confidence and accuracy. However, data is typically available only in raw form, which in most circumstances is unsuitable for machine learning applications.

The work presented here introduces a tool that considerably assists in generating a high-quality dataset, starting with raw data collection, progressing through data pre-processing and validation, and ending with prediction using the selected machine learning algorithm. PROMISE defect datasets and machine learning algorithms such as Linear/Logistic Regression, Random Forest (RF), K-Nearest Neighbour, Support Vector Machine (SVM), CART and Neural Networks are used to build a prediction model. Two defect classifiers are developed: a discretized defect classifier and a regression-based defect classifier.

So far, we have built the capability to collect data dynamically from a GitHub repository. Data pre-processing, various defect predictions, and other machine learning settings are all in the pipeline. Once the data is ready to use, the tool will assist in selecting the appropriate machine learning algorithm and perform the desired prediction in the associated field. The entire process, from data collection to prediction, will be orchestrated in a user-friendly Java Swing application.
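To make the difference between the discretized and the regression-based classifier concrete, here is a minimal Java sketch of how the two prediction targets could be derived from the same data. It assumes each module in the dataset carries a bug count, and that "discretized" means thresholding that count into a defective/clean label; both are assumptions for illustration, not a description of the tool's actual code. The class name `DefectTargets`, its methods, and the sample counts are hypothetical.

```java
import java.util.Arrays;

/**
 * Hypothetical helper, not taken from the tool described in this post.
 * Derives two kinds of prediction targets from per-module bug counts.
 */
final class DefectTargets {

    /**
     * Discretized target (assumed definition): 1 if a module has at least
     * `threshold` bugs, otherwise 0.
     */
    static int[] discretize(int[] bugCounts, int threshold) {
        int[] labels = new int[bugCounts.length];
        for (int i = 0; i < bugCounts.length; i++) {
            labels[i] = bugCounts[i] >= threshold ? 1 : 0;
        }
        return labels;
    }

    /** Regression target: the bug count itself, kept as a numeric value. */
    static double[] regressionTarget(int[] bugCounts) {
        double[] y = new double[bugCounts.length];
        for (int i = 0; i < bugCounts.length; i++) {
            y[i] = bugCounts[i];
        }
        return y;
    }

    public static void main(String[] args) {
        // Made-up bug counts for five modules, for illustration only.
        int[] bugs = {0, 3, 1, 0, 7};
        System.out.println(Arrays.toString(discretize(bugs, 1)));
        System.out.println(Arrays.toString(regressionTarget(bugs)));
    }
}
```

A classifier such as Logistic Regression or K-Nearest Neighbour would be trained on the discretized labels, while Linear Regression would be trained on the numeric target. The threshold of 1 is only one possible choice.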
