Author Information

Bryan F.J. Manly

Abstract

In this paper I discuss three topics that I believe are relevant to the subject of statistics in the new millennium: (a) the impact of computers, and the state of computer-intensive methods as far as practical applications of statistics are concerned; (b) methods for the analysis of the extremely large data sets that are now becoming available; and (c) the use of statistics by scientists in general. For the first topic I suggest that the main advantage of computer-intensive methods is that they can, under certain circumstances, give simple and believable answers to questions when other methods fail. However, I caution against the uncritical use of computer power without proper checks that analyses work, particularly when conclusions depend on very complicated models with many assumptions that are difficult or impossible to verify. For the second topic I note that statistics grew up as a means of extracting the maximum amount of information from small sets of data, and we are now having some difficulty in adapting methods to huge data sets because sometimes the analyses that we might want to do are not possible even with today's powerful computers. I discuss this particularly in terms of the analysis of data on resource selection by animals, where geographical information system data describe what is available for animals to use. For the third topic I suggest that statistics and statisticians have something of an 'image' problem with scientists in general. Many scientists do not appear to regard statistics as important for their discipline, and yet errors in the analysis and interpretation of data seem to be fairly common in the scientific literature.

Keywords

Bootstrapping; Computer-intensive statistics; Extremely large data sets; Geographical Information Systems; Randomization test

Creative Commons License


This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Apr 25th, 8:10 AM

STATISTICS IN THE NEW MILLENNIUM: SOME PERSONAL VIEWS
