Abstract

Principal component analysis is a multivariate statistical procedure that can be used to identify factors (correlated subsets of variables) in large data sets. This statistical method appears useful for scientists investigating soil processes, but it has received little attention. Reported applications of principal component analysis share a common fault: the subjective, user-specified analytical options apparently go unrecognized, since they are not discussed. Reported data sets are often small, have low observations-per-variable ratios, and lack tests of robustness. A large soil data set is used to demonstrate systematic procedures for obtaining an optimum rotated principal component solution. This solution retained 21 variables aligned among four "clean" and "logical" factors, and extracted 79% of the variance. Robustness was confirmed by comparison with common factor analysis solutions. When carefully applied, the presented guidelines should enhance scientists' abilities to identify and transfer knowledge about multivariate data sets, and should allow different scientists to independently arrive at similar factor solutions.
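
The quantities named in the abstract (rotated loadings, communality, proportion of variance extracted) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are synthetic rather than the soil data set, the number of retained factors is fixed by hand, and varimax is used as one common orthogonal rotation because the abstract does not name the rotation applied. The function names `pca_loadings` and `varimax` are hypothetical helpers.

```python
import numpy as np

def pca_loadings(X, n_factors):
    """Unrotated principal component loadings from the correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale eigenvectors so entries are variable-factor correlations.
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return loadings, eigvals

def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix (assumed rotation)."""
    p, k = L.shape
    T = np.eye(k)
    d_old = 0.0
    for _ in range(max_iter):
        B = L @ T
        G = L.T @ (B**3 - B @ np.diag((B**2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt
        d = s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break
        d_old = d
    return L @ T

# Synthetic example: 6 variables driven by 2 latent factors (illustrative only).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
pattern = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                    [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
X = latent @ pattern.T + 0.4 * rng.normal(size=(200, 6))

loadings, eigvals = pca_loadings(X, n_factors=2)
rotated = varimax(loadings)

# Communality: share of each variable's variance captured by the retained factors.
communality = (rotated**2).sum(axis=1)
# Proportion of total variance extracted by the retained factors.
variance_extracted = (rotated**2).sum() / X.shape[1]
```

Because the rotation is orthogonal, it redistributes variance among the retained factors without changing any variable's communality or the total variance extracted. The number of factors retained and the choice of rotation are among the user-specified options the abstract describes as subjective.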

Keywords

factor analysis, communality, variance, robust

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Apr 26th, 1:00 PM

APPLYING PRINCIPAL COMPONENT ANALYSIS TO SOIL-LANDSCAPE RESEARCH-QUANTIFYING THE SUBJECTIVE
