On the application of genetic programming for software engineering predictive

In this case study, a dataset of cancer-related PPI complexes was used to construct a generic PPI prediction model with good prediction accuracy and generalization ability. A high correlation coefficient (CC) magnitude of 0. To validate the discriminatory nature of the model, it was applied to a dataset of diabetes complexes, where it yielded significantly low CC values. Thus, the GP model developed here serves a dual purpose: (a) a predictor of the binding energy of cancer-related PPI complexes, and (b) a classifier for discriminating cancer-related PPI complexes from those of other diseases.
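The correlation coefficient used as the accuracy measure above can be sketched as follows. This is a minimal illustration that assumes CC denotes the Pearson correlation between predicted and observed values, which the text does not state explicitly; the function name `pearson_cc` is hypothetical.

```python
import math


def pearson_cc(predicted, observed):
    """Pearson correlation between predicted and observed values.

    Assumption: the CC reported in the text is the Pearson coefficient.
    """
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_o)
```

Under this reading, a CC magnitude close to 1 on the cancer complexes and a much lower magnitude on the diabetes complexes is what would support using the model as a discriminator.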

The independent variables included 19 different software metrics for each object. Both genetic algorithms and GP were used to get the best splitting of attribute domains for the decision tree and to get a best decision tree. The GA chromosome was represented by a possible splitting for all attributes.

...tions were rediscovered by GP. Also, GP and neural networks (NN) showed superiority over multiple linear regression in case of a non-linear relationship between the independent variables. The conclusion with respect to GP was that it provided similar or better values

These models were then validated on 5 validation data sets. The models that per-

...improvement over classical regression in 2 out of 12 data sets. GP performed well, pred 0. The application of this technique to seven different

1 The data sets in Table 4 are taken at a coarser level, e.

Please cite this article in press as: Afzal, W.

Number of faults for each software module; Maximization of the best percentage of actual faults averaged over the percentiles level of interest and controlling the tree size; Ranking based on lines of code; Number of faults accounted by different cut-off percentiles; Khoshgoftaar et al.
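The evaluation idea mentioned above, the number of faults accounted for at different cut-off percentiles when modules are rank-ordered, can be sketched as follows. This is a hedged illustration rather than the procedure of any primary study: the function name `faults_captured` and the way the cut-off is rounded to a module count are assumptions.

```python
import math


def faults_captured(scores, faults, cutoff):
    """Share of all faults found in the top `cutoff` fraction of modules.

    `scores` ranks the modules (e.g. lines of code, or a model's output);
    `faults` holds the actual fault count of each module.
    Assumption: the cut-off is rounded up to a whole number of modules.
    """
    if len(scores) != len(faults) or not scores:
        raise ValueError("scores and faults must be non-empty and equally long")
    if not 0 < cutoff <= 1:
        raise ValueError("cutoff must be in (0, 1]")
    total = sum(faults)
    if total == 0:
        raise ValueError("no faults to account for")
    # Rank modules from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    k = max(1, math.ceil(cutoff * len(order)))
    return sum(faults[i] for i in order[:k]) / total
```

Calling the function once with lines of code as `scores` and once with a model's predictions, at the same `cutoff`, gives the two numbers that such a comparison would set side by side.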

Article Data sets No. Sampling of training and testing sets Industrial I or academic A Data sets public or private Robinson and McIlroy 1 records for training and 60 records for? Private testing Khoshgoftaar et al. It was also found that size linear regression. The study marginal return with clarity. In comparison with linear and log—log regres- The data sets in Table 6 are taken at a coarser level, e.

Dolado et al. Software fault prediction and reliability growth faulty modules. The study showed that grammar-guided either the fault content or software reliability growth. Schick—Wolverton, Goel—Okumoto, Jelinki— tors. Using multiple evaluation measures of were performed using a NASA data set of C functions.

However, it is not clear from the study how neural networks ... Also, it is not clear from the study what sampling strategy was used to split the data set into training and testing sets.

...all these equations simultaneously. In another study, Kaminsky and Boetticher, the same

The study pensate for data skewness. In comparison with stan- frequency of most occurring instance in this case functions with dard GP and the same reliability growth models as used in the zero faults. Using the found to be more accurate. Sampling of training and testing sets Industrial I or academic A Data sets public or private Dolado et al.

Using prequential likelihood ratio, adjusted mean square error for models based on time. The comparisons were done applied the genetic operators in each generation.

For coverage-based models, an addi- as maximum deviation, average bias, average error, prediction er- tional Kolmogorov—Smirnov test was also used. Sampling of training and testing sets Industrial I or academic A Data sets public or private Kaminsky and Boetticher 1?

I Public Kaminsky and Boetticher 1? I Private Zhang and Yin 1? This indi- answer the research question from each of the primary studies with- cates that further problem-dependent objectives can possibly be in software fault prediction and reliability growth. Discussion and areas of future research comparison groups need to increase. GP needs to be compared with a peting techniques.

Based on our investigation, this research question is answered depending upon the prediction and estimation of the attribute under question.

...more representative set of techniques that have been found successful in earlier research; only then are we able to ascertain

We see from Table 4 that all the data sets were private. In this regard, the publication of private data sets needs to be encouraged. Publication of data sets would encourage other researchers

Another trend that is observable from Table 4 is that the data sets represented real world projects, which adds to the external validity of these results. The other six studies were co-authored by similar authors to a large extent and the data sets also overlapped between studies, but these studies contributed in introducing different variations ... These six studies were in agreement that GP is an effective ... with neural networks, k-nearest neighbor, linear regression and logistic regression. Also, GP was used to successfully rank-order software modules in a better way than the ranking done on the basis of lines of code. It was also shown that numerous enhancements to the GP algorithm are possible, hence improving the evolutionary ... standard GP algorithm for two studies Khoshgoftaar et al. ... two areas of improvement in these studies: (i) increasing the comparisons with more techniques.

Software CES estimation. ...modeling. Two studies were inconclusive in favoring a particular technique either because the different measures did not converge, as in Robinson ... The study results indicate that while GP scores higher on one evaluation measure, it lags behind on others. There is also a trade-off between different qualitative factors, e. ... data sets. There can be different reasons related to the experimental design for the inconsistent results across the studies using GP for software CES estimation. One reason is that the accuracy measures used for evaluation purposes are not near to a standardized use. While the use of pred 0. ... Another aspect which dif- ... These different sampling strategies are also a potential contributing factor in inconsis-
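The point that a model can score higher on one accuracy measure while lagging on another can be illustrated with two measures commonly used in estimation studies, the mean magnitude of relative error (MMRE) and pred(l). The sketch below is illustrative only: the function names are hypothetical, and the threshold `level` is left as a parameter because the value used in the text is truncated.

```python
def mre(actual, estimate):
    """Magnitude of relative error for one observation (actual must be non-zero)."""
    if actual == 0:
        raise ValueError("MRE is undefined when the actual value is zero")
    return abs(actual - estimate) / abs(actual)


def mmre(actuals, estimates):
    """Mean magnitude of relative error over all observations."""
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("inputs must be non-empty and equally long")
    return sum(mre(a, e) for a, e in zip(actuals, estimates)) / len(actuals)


def pred(actuals, estimates, level):
    """Fraction of estimates whose MRE is at most `level`.

    `level` is a caller-supplied threshold (an assumption of this sketch).
    """
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("inputs must be non-empty and equally long")
    hits = sum(1 for a, e in zip(actuals, estimates) if mre(a, e) <= level)
    return hits / len(actuals)
```

With hypothetical actual values `[100, 200, 400]`, a model estimating `[130, 260, 520]` is uniformly 30% off, while a model estimating `[100, 200, 800]` is exact twice and far off once. The first has the lower MMRE, yet the second has the higher pred at any threshold below 0.3, so the ranking of the two models depends on which measure is reported.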

As with the studies on software quality data sets. What is also 3 The data sets in Table 8 are taken at a coarser level, e. Article Quotation Robinson and McIlroy While generally not as good as the results obtained from other methods, the GP results are reasonably accurate but low on coverage Reformat et al. From the values shown in the tables, there is no great superiority of one method versus the others. GP can be used as an alternative to linear regression, or as a complement to it.

The equations have provided similar or better values than the regression equations. In the case of linear relationships, some of the small improvements obtained by GP compared to MLR come at the expense of the simplicity of the equations, but the majority of the linear equations are rediscovered by GP Regolin et al. GP and ANN are valid and promising approaches to size estimation. However, GP presents additional advantages with respect to NN.

The main advantage of using GP is the easy interpretation of result Dolado From the point of view of the capabilities of the two methods, GP achieves better values in the pred 0. However, there are different mechanisms to avoid 1. The impression from these studies and placing limits on the size of the GP trees. These mechanisms is that GP performs superior on one evaluation measure at the should be explored further.

The qualitative scores for GP models are both good and bad. But another key question is whether or not we are able to have a reasonable explanation of the relationship between the variables.

...the last column in Table 7. However, as Table 8 showed, it was not clear from four studies which sampling strategies were used for the training and testing sets. From two of these four studies, ... If, however, we exclude these four studies from our analysis, GP is still a favorable approach for three out of four studies.

Besides, we solutions is thus important. Table 7 also points 1 and 2 indicate above. We believe that these challenges of- shows that the variety of comparison groups is represented poorly; fer promising future work to undertake for researchers. Validity threats baseline. What is evident from these studies is the following: information given in the primary studies Kitchenham et al. The results of this review resulted in a to- us a complete set. Our study selection procedure Section 2.

Conclusions i.

B: Do the study measures allow the research questions to be answered?
C: Is the sample representative of the population to which the results will generalize?
D: Is there a comparison group?
E: Is there an adequate description of the data collection methods?
F: Is there a description of the method used to analyze data?
G: Was statistical hypothesis testing undertaken?
H: Are all study questions answered?
J: Are the parameter settings for the algorithms given?
K: Is there a description of the training and testing sets used for the model construction methods?

Liu et al. McIlroy et al. Study quality assessment code. This is irrespective of the fact that See Table Abraham, A.

For software CES estimation, the study results were inconclusive ... The main reason is that GP optimizes one accuracy measure while degrading others. We were therefore inconclusive in judging whether or not GP is an ... The results for software fault prediction and software reliability ... out of eight studies resulting in GP performing better than neural ... Although four out of these eight studies lacked in some of the quality instruments used in Table 10 (Appendix A), still three out of the remaining four studies reported results in support of GP. We therefore concluded that the current literature ...

Based on the results of the primary studies, we can offer the following ... Some of these recommendations refer to ... Use public data sets wherever possible.

Real time intrusion prediction, detection and prevention programs.
Afzal, W. A comparative evaluation of using genetic programming for predicting fault count data. In Proceedings of the 3rd ...
A systematic review of search-based testing for non-functional system properties.
An example of software system debugging. International Federation for Information Processing Congress, 71 (1).
An indexed bibliography of genetic programming, Report Series no GP, Department of Information Technology and Industrial Management, University of Vaasa, Finland, last checked: 13 Feb
Alfaro-Cid, E. Genetic programming for the automatic design of controllers for a surface ship. IEEE Transactions on Intelligent Transportation Systems, 9 (2).
Reactive and memory-based genetic programming for robot control.
Evolutionary computation 1: Basic algorithms and operators.
Banzhaf, W. Genetic programming: An introduction. Morgan Kaufmann Publishers, Inc.
Barr, R. Journal of Heuristics, 1 (1), 9
Burgess, C. Can genetic programming improve software effort estimation? Information and Software Technology, 43 (14).

Abstract: This thesis addresses the task of feature construction for classification.

Keyphrases: genetic programming, feature construction, feature space, original attribute, new attribute, new one, real world data, interesting knowledge, predictive power, genetic program, data set, decision tree, classification technique, decision tree, non linear combination, fitness measure, important factor, classification algorithm.


