Information and Software Technology 51 (11): 1573-1585 (2009)

Is Non-Parametric Hypothesis Testing Model Robust for Statistical Fault Localization? 1

Zhenyu Zhang 2, W.K. Chan 3, T.H. Tse 4, Peifeng Hu 5, and Xinming Wang 6

[paper from ScienceDirect | technical report TR-2009-11]


Fault localization is one of the most difficult activities in software debugging. Many existing statistical fault-localization techniques estimate the fault positions of programs by comparing the program feature spectra between passed runs and failed runs. Some existing approaches develop estimation formulas based on the mean values of the underlying program feature spectra and on their distributions. Our previous work advocates the use of a non-parametric approach in estimation formulas to pinpoint fault-relevant positions. It is worthwhile to resolve these two schools of thought by examining the fundamental properties of the distributions related to fault localization. In particular, we ask: Can the feature spectra of program elements be safely considered as normal distributions so that parametric techniques can be soundly and powerfully applied? In this paper, we empirically investigate this question from the program predicate perspective. We conduct an experimental study based on the Siemens suite of programs. We examine the degree of normality of the distributions of evaluation biases of the predicates, and obtain three major results from the study. First, almost all examined distributions of evaluation biases are either normal or far from normal, but not in between. Second, the most fault-relevant predicates are less likely to exhibit normal distributions in terms of evaluation biases than other predicates. Our results show that normality is not common as far as evaluation biases are concerned. Third, the effectiveness of our non-parametric predicate-based fault-localization technique correlates only weakly with the distributions of evaluation biases, making the technique robust to this type of uncertainty in the underlying program spectra.
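As a rough illustration of what examining the normality of evaluation biases can involve, the sketch below computes a per-run evaluation bias for one predicate and applies a normality test to the resulting sample. Two things are assumptions rather than details taken from the abstract: the evaluation bias is taken to be the fraction of a predicate's evaluations that are true in a run (the usual definition in predicate-based fault localization), and the Jarque-Bera statistic stands in for whichever normality test the study actually uses. The function names and the run counts are hypothetical.

```python
import math


def evaluation_bias(n_true, n_false):
    """Fraction of a predicate's evaluations that were true in one run.

    Assumed definition; a predicate never evaluated in a run has no bias.
    """
    total = n_true + n_false
    if total == 0:
        raise ValueError("predicate was never evaluated in this run")
    return n_true / total


def jarque_bera(sample):
    """Return (statistic, p_value) of the Jarque-Bera normality test.

    The p-value uses the asymptotic chi-square distribution with two
    degrees of freedom, whose survival function is exp(-x / 2).
    """
    n = len(sample)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(sample) / n
    # Biased central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    if m2 == 0:
        raise ValueError("sample is constant; normality is undefined")
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    return statistic, math.exp(-statistic / 2.0)


# Hypothetical (true, false) evaluation counts of one predicate over six runs.
runs = [(3, 1), (0, 4), (4, 0), (5, 0), (0, 2), (1, 0)]
biases = [evaluation_bias(t, f) for t, f in runs]
statistic, p_value = jarque_bera(biases)
print(f"JB = {statistic:.3f}, p = {p_value:.3f}")
```

A small p-value suggests that the sample of evaluation biases is far from normal. With only a handful of runs the asymptotic p-value is unreliable, so the example is meant to show the shape of the computation and nothing more.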

Keywords: Fault localization, non-parametric, hypothesis testing, normality

1. This research is supported in part by GRF grants of the Research Grants Council of Hong Kong (project nos. 123207 and 716507) and SRG grants of City University of Hong Kong (project nos. 7002324 and 7002464).
2. Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong.
3. Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Hong Kong.
4. (Corresponding author.) Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong.
5. China Merchants Bank, Central, Hong Kong.
6. Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong.

