Abstract

Multivariate normality of the data is a prerequisite for many multivariate statistical methods; inference is valid and reliable only when this assumption holds. Tests developed to validate the assumption, including the Doornik-Hansen (DH), Shapiro-Francia (SF), Mardia Skewness (MS), Mardia Skewness for small samples (MSS), Mardia Kurtosis (MK), Skewness (S), Kurtosis (K), Shapiro-Wilk (SW), Royston (R), Desgagne-Micheaux (DM), Henze-Zirkler (HZ), Energy (E), Gel-Gastwirth (GG) and Bontemps-Meddahi (BM) tests, often lead to different conclusions. These differences can be misleading. Consequently, this paper examined the effect of correlations on the Type 1 error rates of multivariate tests of normality. Monte Carlo experiments were conducted with one thousand (1000) replications, varying the dimension, correlation and sample size of the multivariate data. A test is considered affected by correlation if its estimated Type 1 error rate changes as the correlation changes. A test is considered good if its estimated error rate approximates the true error rate, and best if the number of times it does so, counted over the levels of correlation, sample sizes and levels of significance, is the highest (the mode). Results show that the Type 1 error rates of the DH, SF, SW, R, DM, GG and BM tests are affected by correlations and are relatively not good, whereas the Type 1 error rates of the HZ, MS, MK, MSS, S, K and E tests are not only unaffected by correlations but are also relatively good. Overall, the MS, R, MSS, HZ and E tests have good Type 1 error rates, with those of the E and HZ tests being the best. The E and HZ tests are therefore recommended to practitioners.
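
The Monte Carlo design described above can be illustrated with a minimal sketch in Python. It is not the paper's code. It assumes an equicorrelation structure (every pair of variables shares the same correlation rho), uses only the asymptotic chi-square form of Mardia's skewness test as a stand-in for the tests studied, and the function names mardia_skewness_pvalue and type1_error_rate are hypothetical. Data are drawn from a multivariate normal distribution, so every rejection is a Type 1 error, and the rejection fraction estimates the Type 1 error rate.

```python
import numpy as np
from scipy.stats import chi2


def mardia_skewness_pvalue(x):
    """Asymptotic p-value of Mardia's multivariate skewness test."""
    n, p = x.shape
    centered = x - x.mean(axis=0)
    # Maximum-likelihood covariance estimate (divisor n).
    s = centered.T @ centered / n
    # d[i, j] = (x_i - mean)' S^{-1} (x_j - mean)
    d = centered @ np.linalg.solve(s, centered.T)
    b1 = (d ** 3).sum() / n ** 2          # sample multivariate skewness
    stat = n * b1 / 6.0                   # approx. chi-square under normality
    df = p * (p + 1) * (p + 2) / 6.0
    return chi2.sf(stat, df)


def type1_error_rate(n, p, rho, alpha=0.05, reps=1000, seed=0):
    """Estimate the Type 1 error rate by simulation under normality."""
    rng = np.random.default_rng(seed)
    # Assumed equicorrelation matrix: 1 on the diagonal, rho elsewhere.
    cov = np.full((p, p), rho) + (1.0 - rho) * np.eye(p)
    rejections = 0
    for _ in range(reps):
        x = rng.multivariate_normal(np.zeros(p), cov, size=n)
        if mardia_skewness_pvalue(x) < alpha:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    # Compare estimated rates across correlation levels at alpha = 0.05.
    for rho in (0.0, 0.3, 0.6, 0.9):
        print(rho, type1_error_rate(n=50, p=3, rho=rho))
```

Under this sketch, a test would count as affected by correlation if the printed rates drift as rho changes, and as good if they stay close to the chosen level of significance. The sample size, dimension and correlation grid shown here are illustrative choices, not the settings used in the paper.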

Keywords

Correlation, Level of Significance, Type 1 error rate, Multivariate Normality test

Creative Commons License

This work is licensed under a Creative Commons Attribution-No Derivative Works 3.0 License.

Jan 1st, 12:00 AM

Effect of Correlations on Type 1 Error Rates of Some Multivariate Normality Tests