SRGM with Imperfect Debugging using Capability Analysis of Log-Logistic Model

This paper analyzes the predictive capability of several software reliability growth models (SRGMs) to understand how their parameters affect the estimation process. The predictive validity analysis rests on two key factors: the degree of fit to the available failure data and the prediction capability of the model. The aim is to weigh the trade-off between choosing a simple model and a more complex one by comparing their performance across multiple data sets. The data sets come from different time periods, so that the effect of differing technologies and processes on these models can be observed.


INTRODUCTION
Many software reliability models are available. Each model uses two or three parameters to estimate reliability from actual failure data. The models are designed around the trend expected in the failure data. To identify the trend, one of the following assumptions is made:
• Fault identification is rapid in the initial stage and then reaches a steady state.
• Fault identification is slow initially, increases rapidly, and finally reaches a steady state.
• Fault identification grows steadily from the initial stage and then reaches a steady state.
Each trend depends, completely or partially, on the following factors:
• Performance capability of the testing team
• Complexity of the application domain
• Kind of technology used
• Size of the application
• Kind of software development process
An experienced tester identifies more faults than a less experienced one. The complexity of the application and the time constraints on fault identification may cause new faults to be introduced or existing faults to be overlooked.
New technologies have changed the software development process. The SSAD (Structured System Analysis and Design) technique was used in earlier times; with the advent of distributed objects, the OOAD (Object-Oriented Analysis and Design) technique came into use. These changes brought corresponding changes to the testing and debugging process compared with SSAD-based development. Internet technology allows new systems and legacy systems to be integrated, and systems in which old and new components coexist are interesting to study.
Different software reliability growth models have been developed over time, and models with two or more parameters have grown more complex to meet changing needs. In this paper we focus on validating the predictions of software reliability growth models with two or three parameters. Alongside the predictions made by Prince (2006) [1], we present the log-logistic model for both ungrouped and grouped data sets taken from the early 1980s and the 1990s. The degree of fit of the log-logistic model is studied using a fixed percentage of the available failure data, and its predictive performance is validated together with the models validated by Prince (2006) [1].

Software Reliability Growth Models (SRGMs)
In addition to the six software reliability growth models analyzed by Prince (2006) [1], the log-logistic model was selected, giving a total of seven models for the predictive capability analysis. The common assumptions made in these models are:
• The software system is subject to failures at random times caused by software faults.
• Identified faults are corrected immediately and do not reappear.

Two parameter models:
The following three SRGMs, each designed with two parameters and having a specific characteristic, were selected for the prediction capability analysis:

Delayed s-shaped growth model:
The advantage of this model is that it is designed for the analysis of fault isolation data. Fault isolation means that some failures can be intentionally reproduced, so that the underlying fault can be identified and removed. The mean value function for this model (Yamada et al., 1983) [2] is:
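In the notation usually adopted for this model, the delayed S-shaped mean value function is commonly written as shown below. The symbols a (expected total number of faults) and b (fault detection rate) are assumptions introduced here, since the source text does not name them.

```latex
\[
m(t) = a\left(1 - (1 + bt)\,e^{-bt}\right), \qquad a > 0,\; b > 0
\]
```

The factor (1 + bt) delays the initial growth, which produces the S-shaped curve.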

Exponential growth model:
The model proposed by Goel and Okumoto (1979) [3] is used in this study. The failure data is assumed to follow an exponential curve. The mean value function for this model is:
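The Goel-Okumoto mean value function is commonly written as shown below. The symbols a (expected total number of faults) and b (fault detection rate per fault) are assumed names, not taken from the source text.

```latex
\[
m(t) = a\left(1 - e^{-bt}\right), \qquad a > 0,\; b > 0
\]
```

Because m(t) approaches a as t grows, this is a finite-failure model.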

Logarithmic Poisson model:
This model, proposed by Musa and Okumoto (1984) [4], is classified as an infinite-failure model because it assumes no upper bound on the number of failures. Its mean value function is:
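One common way of writing the logarithmic Poisson mean value function is sketched below. The two-parameter symbols a and b are assumptions made here for consistency with the other models; the original Musa-Okumoto parameterization uses an initial failure intensity and a decay parameter instead.

```latex
\[
m(t) = a \ln(1 + bt), \qquad a > 0,\; b > 0
\]
```

Since ln(1 + bt) grows without bound, m(t) has no finite limit, which matches the infinite-failure classification.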

Three parameter models:
The following SRGMs designed with three parameters were selected for the performance analysis:

Imperfect debugging model:
Imperfect debugging occurs when the debugging process fails to remove the error. The model proposed by Kapur et al. (1980) [5] is used here. With p denoting the probability of perfect debugging, the mean value function for this model is:
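How p enters a mean value function can be illustrated with a short derivation. This is a sketch under a stated assumption, namely that the fault removal rate is proportional to the number of remaining faults and is scaled by p; it may differ from the exact form used in [5]. The symbols a (initial fault content) and b (fault detection rate) are assumed names.

```latex
\[
\frac{dm(t)}{dt} = b\,p\,\bigl(a - m(t)\bigr), \quad m(0) = 0
\;\;\Longrightarrow\;\;
m(t) = a\left(1 - e^{-bpt}\right)
\]
```

With p = 1 every debugging attempt succeeds and the expression reduces to the exponential growth model.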

Inflection s-shaped growth model:
According to this model, the observed software reliability growth becomes S-shaped if the faults in a program are mutually dependent. With c denoting the inflection parameter, which is determined by the inflection rate r, the mean value function for this model (Ohba 1984a) is:
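The inflection S-shaped mean value function is commonly written as shown below, with c and r as named in the text. The symbols a (expected total number of faults) and b (fault detection rate) are assumptions introduced here.

```latex
\[
m(t) = \frac{a\left(1 - e^{-bt}\right)}{1 + c\,e^{-bt}}, \qquad c = \frac{1 - r}{r}, \quad 0 < r \le 1
\]
```

For r = 1 the inflection parameter c is zero and the curve reduces to the exponential growth model.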

Logistic growth model:
In this popular model (Yamada et al., 1983), the software failures are assumed to follow a logistic curve. The mean value function for this model [6] is:
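A commonly cited form of the logistic mean value function is sketched below. The symbols a (expected total number of faults), b (growth rate) and k (a positive shape constant) are assumed names, not taken from the source text.

```latex
\[
m(t) = \frac{a}{1 + k\,e^{-bt}}, \qquad a > 0,\; b > 0,\; k > 0
\]
```

Note that in this form m(0) = a / (1 + k) rather than zero.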

Log-logistic model:
In this model the software failures are assumed to follow a log-logistic curve. With k denoting the inflection parameter, the mean value function of this model [7] is:
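A commonly cited form of the log-logistic mean value function is sketched below. The symbol k is the inflection parameter named in the text; a (expected total number of faults) and \lambda (scale parameter) are assumptions introduced here.

```latex
\[
m(t) = \frac{a\,(\lambda t)^{k}}{1 + (\lambda t)^{k}}, \qquad a > 0,\; \lambda > 0,\; k > 0
\]
```

For k > 1 the failure intensity first rises and then falls, while for k \le 1 it decreases monotonically.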

Prediction Capability Approach
The prediction capability of each model is analyzed using two data sets. The data sets differ in size and come from different time periods.
The predictive validity process consists of the following steps (Prince 2006):
• Estimation of the parameters of each model using the first 80% of the failure data
• Model analysis:
1. Goodness of fit for all models using the first 80% of the failure data.
2. Model prediction capability, compared by validating against the last 20% of the available failure data.
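The 80/20 division described in these steps can be sketched as a chronological split. The function name `split_failure_data` and the `fit_fraction` parameter are hypothetical names chosen for illustration; the observations are assumed to be ordered by time.

```python
def split_failure_data(observations, fit_fraction=0.8):
    """Split time-ordered observations into a fitting part and a hold-out part.

    The first `fit_fraction` of the data is used for parameter estimation and
    goodness of fit; the remainder is kept back to validate predictions.
    """
    if not 0.0 < fit_fraction < 1.0:
        raise ValueError("fit_fraction must lie strictly between 0 and 1")
    n_fit = int(round(len(observations) * fit_fraction))
    return observations[:n_fit], observations[n_fit:]


# Illustrative data only: ten weekly cumulative fault counts.
weekly_cumulative_faults = [5, 11, 16, 20, 24, 27, 29, 31, 32, 33]
fit_part, holdout_part = split_failure_data(weekly_cumulative_faults)
```

The split is chronological rather than random because the hold-out part has to represent future failures that the model is asked to predict.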

Parameter Estimation
The parameters of each selected model were estimated by maximizing the log-likelihood function (Ohba 1984a), in which z_i is the cumulative number of faults detected before time t_i and n is the number of observation intervals.
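For grouped failure data from a non-homogeneous Poisson process with mean value function m(t), the standard log-likelihood is written below. It is assumed here that this is the form intended in the text; the conventions t_0 = 0 and z_0 = 0 are also assumptions.

```latex
\[
\ln L = \sum_{i=1}^{n} (z_i - z_{i-1}) \ln\bigl[m(t_i) - m(t_{i-1})\bigr]
        - m(t_n)
        - \sum_{i=1}^{n} \ln\bigl[(z_i - z_{i-1})!\bigr]
\]
```

The last sum does not depend on the model parameters, so it can be ignored during maximization.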
The mean value function of each model is substituted into the log-likelihood function. The result is then differentiated with respect to each parameter to obtain the parameter-specific likelihood equations. These equations are set to zero and solved to arrive at the respective parameter values [8].
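As an illustration of this procedure, the sketch below applies it to the exponential growth model with grouped data, assuming the form m(t) = a(1 - exp(-bt)). Setting the derivative with respect to a to zero gives a in closed form, and the remaining equation in b is solved by bisection. The function names, the bracketing interval and the choice of bisection are assumptions made for this example, not the procedure used in the paper.

```python
import math


def _score_b(b, times, cum):
    """Derivative of the log-likelihood with respect to b, with a eliminated."""
    total = 0.0
    t_prev, z_prev = 0.0, 0.0
    for t, z in zip(times, cum):
        delta = t - t_prev
        # -expm1(-x) equals 1 - exp(-x) without cancellation error for small x.
        total += (z - z_prev) * (t * math.exp(-b * delta) - t_prev) / -math.expm1(-b * delta)
        t_prev, z_prev = t, z
    t_n, z_n = times[-1], cum[-1]
    return total - z_n * t_n * math.exp(-b * t_n) / -math.expm1(-b * t_n)


def fit_goel_okumoto(times, cum, b_lo=1e-6, b_hi=10.0, iterations=200):
    """Solve the likelihood equations for (a, b) from cumulative fault counts."""
    if _score_b(b_lo, times, cum) <= 0 or _score_b(b_hi, times, cum) >= 0:
        raise ValueError("no sign change in [b_lo, b_hi]; the data may show no reliability growth")
    for _ in range(iterations):
        mid = 0.5 * (b_lo + b_hi)
        if _score_b(mid, times, cum) > 0:
            b_lo = mid
        else:
            b_hi = mid
    b = 0.5 * (b_lo + b_hi)
    # Closed-form solution of the likelihood equation for a.
    a = cum[-1] / -math.expm1(-b * times[-1])
    return a, b


# Synthetic, noise-free data generated from a = 100, b = 0.1 (illustration only).
times = [float(i) for i in range(1, 21)]
cum = [100.0 * (1.0 - math.exp(-0.1 * t)) for t in times]
a_hat, b_hat = fit_goel_okumoto(times, cum)
```

Models with three parameters generally have no closed-form reduction of this kind, so a general-purpose numerical optimizer applied to the log-likelihood would be a more practical choice there.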
The parameters estimated for each model using published failure data sets are given in Tables 1 and 2.

Model analysis
The performance of each model was analyzed in terms of its goodness of fit and its predictive capability. The first 80% of the failure data was used to assess the degree of fit. The remaining 20% of the failure data was then predicted using the estimated parameters. Both the fit and the prediction capability of each model were measured by the sum of squared errors (SSE).
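The two SSE measures can be sketched as follows. The helper names `sse` and `fit_and_prediction_sse`, the example mean value function and its parameter values are hypothetical and serve only to show the calculation.

```python
import math


def sse(observed, predicted):
    """Sum of squared errors between observed and predicted cumulative faults."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def fit_and_prediction_sse(times, cum, mvf, n_fit):
    """Return (fit SSE on the first n_fit points, prediction SSE on the rest).

    `mvf` is a mean value function whose parameters were already estimated
    from the first n_fit observations.
    """
    predicted = [mvf(t) for t in times]
    return sse(cum[:n_fit], predicted[:n_fit]), sse(cum[n_fit:], predicted[n_fit:])


def example_mvf(t):
    """Exponential growth curve with assumed parameters a = 100, b = 0.1."""
    return 100.0 * (1.0 - math.exp(-0.1 * t))


# Illustrative observations: the curve itself plus a small fixed disturbance.
times = [float(i) for i in range(1, 11)]
cum = [example_mvf(t) + (0.5 if i % 2 else -0.5) for i, t in enumerate(times)]
fit_error, prediction_error = fit_and_prediction_sse(times, cum, example_mvf, n_fit=8)
```

A lower fit SSE indicates a closer match to the data used for estimation, while a lower prediction SSE indicates better forecasts of the held-out failures; the two need not favor the same model.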

Conclusions
In this paper we have presented several two- and three-parameter models, each based on an expected trend in the failure data. The three-parameter inflection S-shaped model has the best fit among all the models analyzed, and the log-logistic model shows the best overall failure data prediction capability. Among the two-parameter models, the delayed S-shaped model can be considered the best.