Statistical Errors in Software Engineering Experiments: A Preliminary Literature Review

EasyChair Preprint 188, version 2

12 pages • Date: July 4, 2018

Rolando Reyes, Oscar Dieste, Efraín R. Fonseca C. and Natalia Juristo

Abstract

Background: Statistical concepts and techniques are often applied incorrectly, even in mature disciplines such as medicine or psychology. Surprisingly, there are very few works that study statistical problems in software engineering (SE).

Aim: Assess the existence of statistical errors in SE experiments.

Method: Compile the most common statistical errors in experimental disciplines. Survey experiments published in ICSE to assess whether errors occur in high-quality SE publications.

Results: The same errors identified in other disciplines were found in ICSE experiments: 30% of the reviewed papers included several error types, such as a) missing statistical hypotheses, b) missing sample size calculation, c) failure to assess statistical test assumptions, and d) uncorrected multiple testing. This rather large error rate is greater for research papers where experiments are confined to the validation section. The origin of the errors can be traced back to a) researchers not having sufficient statistical training, and b) a profusion of exploratory research.

Conclusions: This paper provides preliminary evidence that SE research suffers from the same statistical problems as other experimental disciplines. However, the SE community appears to be unaware of any shortcomings in its experiments, whereas other disciplines work hard to avoid these threats. Further research is necessary to find the underlying causes and set up corrective measures, but some potentially effective actions are a priori easy to implement: a) improve the statistical training of SE researchers, and b) enforce quality assessment and reporting guidelines in SE publications.

Keyphrases: Prevalence, Statistical Errors, literature review, survey

Links:	https://easychair.org/publications/preprint/rntz
	https://doi.org/10.29007/964b

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:188,
  author    = {Rolando Reyes and Oscar Dieste and Efraín R. Fonseca C. and Natalia Juristo},
  title     = {Statistical Errors in Software Engineering Experiments: A Preliminary Literature Review},
  doi       = {10.29007/964b},
  howpublished = {EasyChair Preprint 188},
  year      = {EasyChair, 2018}}
