#### Abstract

Advances in computers and modeling over the past couple of decades have greatly expanded options for analyzing non-normal data. Prior to the 1990s, options were largely limited to analysis of variance (ANOVA), either on untransformed data or after applying a variance-stabilizing transformation. With or without transformations, this approach depends heavily on the Central Limit Theorem and ANOVA’s robustness. The availability of software such as R’s lme4 package and SAS® PROC GLIMMIX changed the conversation with regard to non-normal data. With expanded options come dilemmas. We have software choices – R and SAS among many others. Models have conditional and marginal formulations. There are GLMMs and GEEs, among a host of other acronyms. There are different estimation methods – linearization (e.g. pseudo-likelihood), integral approximation (e.g. quadrature) and Bayesian methods. How do we decide what to use? How much, if any, advantage is there to using GLMMs or GEEs versus more traditional ANOVA-based methods? Stroup (2013) introduced a design-to-model thought exercise called WWFD (What Would Fisher Do). This paper illustrates the use of WWFD to clarify thinking about plausible probability processes giving rise to data in designed experiments, modeling options for analyzing non-normal data, and how to use the two to evaluate small-sample behavior of competing options. Examples with binomial and count data are given. While the examples are not exhaustive, they raise issues and call into question common practice and conventional wisdom regarding non-normal data in agricultural research.

#### Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

#### Recommended Citation

Stroup, W. W.
(2013).
"NON-NORMAL DATA IN AGRICULTURAL EXPERIMENTS,"
*Annual Conference on Applied Statistics in Agriculture*.
http://newprairiepress.org/agstatconference/2013/proceedings/8

NON-NORMAL DATA IN AGRICULTURAL EXPERIMENTS

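As a concrete illustration of the binomial modeling the abstract refers to, the sketch below fits a one-predictor logistic regression (a binomial GLM with a logit link, the fixed-effects core of the GLMMs discussed) by Newton–Raphson, which is equivalent to iteratively reweighted least squares. This is not code from the paper – in practice one would use R's `glm`/`glmer` or SAS PROC GLIMMIX – and the helper name `fit_logistic` is hypothetical; it is a minimal, self-contained sketch of the estimation idea.

```python
import math

def fit_logistic(x, y, iters=25):
    """Fit a one-predictor binomial GLM with logit link by
    Newton-Raphson (iteratively reweighted least squares).
    x: list of covariate values; y: list of 0/1 responses.
    Returns (intercept, slope). Hypothetical helper, not from the paper."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        # Accumulate the score (gradient) and the 2x2 information matrix
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # inverse logit
            w = p * (1.0 - p)                            # IRLS weight
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Newton step: beta += information^{-1} * score (2x2 inverse by hand)
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy usage: success probability increasing with dose-like covariate x
b0, b1 = fit_logistic([0, 1, 2, 3, 4, 5], [0, 0, 1, 0, 1, 1])
```

At convergence the score equations force the fitted probabilities to reproduce the observed success total, one of the properties that distinguishes likelihood-based GLM estimation from ANOVA on transformed proportions.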