Information and Software Technology 51 (11): 1573-1585 (2009)
Zhenyu Zhang, W.K. Chan, T.H. Tse, Peifeng Hu, and Xinming Wang
ABSTRACT
Fault localization is one of the most difficult activities in software debugging.
Many existing statistical fault-localization
techniques estimate the fault positions of programs by comparing the
program feature spectra between passed runs and failed runs.
Some existing approaches develop estimation formulas based on the mean values
of the underlying program feature spectra and on their distributions.
Our previous work advocates the use of a non-parametric approach in estimation
formulas to pinpoint fault-relevant positions.
It is worth studying further how to resolve these two schools of thought
by examining the fundamental, underlying properties of distributions
related to fault localization.
In particular, we ask: Can the feature spectra of program elements
be safely considered as normal distributions so that parametric techniques
can be soundly and powerfully applied?
In this paper, we empirically investigate this question from the program
predicate perspective.
We conduct an experimental study based on the Siemens suite of programs.
We examine the degree of normality of the distributions of evaluation
biases of the predicates, and obtain three major results from the study.
First, almost all examined distributions of evaluation biases are
either normal or far from normal, but not in between.
Second, the most fault-relevant predicates are less likely to exhibit
normal distributions in terms of evaluation biases than other predicates.
Our results show that normality is not common, at least as far as evaluation
biases can represent it.
Furthermore, the effectiveness of our non-parametric predicate-based
fault-localization technique weakly correlates with the distributions
of evaluation biases, making the technique robust to this type of
uncertainty in the underlying program spectra.
Keywords: Fault localization, non-parametric, hypothesis testing, normality
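
To make the central question concrete, the following is a minimal sketch of how one might check whether a predicate's evaluation biases look normally distributed. It rests on assumptions not stated in the abstract: the evaluation bias of a predicate in one run is taken to be the fraction of its evaluations that were true, and normality is checked with the Jarque-Bera statistic, which is not necessarily the test used in the study. The function names and the sample data are hypothetical.

```python
import math


def evaluation_bias(n_true, n_false):
    """Assumed definition: share of a predicate's evaluations in one run that were true."""
    total = n_true + n_false
    if total == 0:
        raise ValueError("predicate was never evaluated in this run")
    return n_true / total


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a sequence of numbers.

    JB = n/6 * (S^2 + (K - 3)^2 / 4), with S the sample skewness and K the
    sample kurtosis. Under normality JB is asymptotically chi-square with
    2 degrees of freedom, whose survival function is exp(-x / 2). The
    p-value is only approximate for small samples.
    """
    n = len(sample)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n  # second central moment
    if m2 == 0:
        raise ValueError("constant sample: normality cannot be assessed")
    m3 = sum((x - mean) ** 3 for x in sample) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in sample) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)


# Hypothetical evaluation biases of two predicates across test runs.
bell_shaped = [0.3, 0.4, 0.4, 0.5, 0.5, 0.5, 0.6, 0.6, 0.7]
two_peaked = [0.0] * 20 + [1.0] * 20  # always false in some runs, always true in others

for name, biases in [("bell_shaped", bell_shaped), ("two_peaked", two_peaked)]:
    jb, p = jarque_bera(biases)
    verdict = "normality rejected" if p < 0.05 else "normality not rejected"
    print(f"{name}: JB={jb:.3f}, p={p:.3f}, {verdict} at the 0.05 level")
```

A parametric technique would be on safer ground for a predicate like the first sample than for one like the second, where the biases cluster at the two extremes.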