Behavioral Ecology Advance Access originally published online on September 1, 2004
Behavioral Ecology 2005 16(1):325; doi:10.1093/beheco/arh145
Behavioral Ecology vol. 16 no. 1 © International Society for Behavioral Ecology 2005; all rights reserved.

Forum

What hypothesis tests are not: a reply to Johnson

Nick Colegrave (a) and Graeme D. Ruxton (b)

(a) SBS, University of Edinburgh, Ashworth Laboratories, King's Buildings, West Mains Road, Edinburgh, EH9 3JT, UK, and (b) Division of Environmental and Evolutionary Biology, IBLS, Graham Kerr Building, University of Glasgow, Glasgow, G12 8QQ, UK

Address correspondence to N. Colegrave. E-mail: n.colegrave{at}ed.ac.uk.

Received 28 May 2004; accepted 4 June 2004.

We are sorry that Johnson's pleasure at reading our recent paper (Colegrave and Ruxton, 2003) was so short-lived. Johnson (2005) is correct that the definition of the P value that we used in that paper is incorrect, and we are grateful to him for correcting this unfortunate slip of the pen. However, although Pr{hypothesis|data} may differ dramatically from Pr{data|hypothesis}, this has no effect whatsoever on the arguments that we were making. Thus, we would like to emphasise that the important take-home message of our paper remains unchanged: using confidence intervals to think about the range of effect sizes that are consistent with the data is more useful than dwelling on P values and post hoc power analysis.

Johnson also correctly points out that considering only the range of the confidence interval, rather than both its position and its range, can lead to a misinterpretation of the likelihood of the real effect size being small or zero. Indeed, probably the best statistic to quote would be the estimated probability of the effect size being within some user-defined tolerance (d) of zero. For parametric tests this can be obtained very easily from the confidence interval. Assuming that the defined tolerance is less than the magnitude of the measured effect size (e), this probability is

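$$P = \Pr(x < x_1) - \Pr(x < x_2), \tag{1}$$
where

$$x_1 = \frac{|e| + d}{\mathrm{SE}}, \tag{2}$$

$$x_2 = \frac{|e| - d}{\mathrm{SE}}, \tag{3}$$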
and SE is the standard error on which the confidence interval is based. The probabilities in Equation (1) are obtained from the appropriate cumulative probability distribution (usually, the t-distribution with the appropriate degrees of freedom for small samples, and the normal distribution for large samples).
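
The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code. It assumes that the probability takes the form Pr(x < x1) - Pr(x < x2), with x1 = (|e| + d)/SE and x2 = (|e| - d)/SE, and it uses the normal distribution (the large-sample case); for small samples a t-distribution with the appropriate degrees of freedom would replace it. The function names `prob_within_tolerance` and `se_from_interval` and the numbers in the usage line are hypothetical.

```python
from statistics import NormalDist


def prob_within_tolerance(e, se, d):
    """Estimated probability that the true effect size lies within d of zero.

    Assumed form: Pr(x < x1) - Pr(x < x2), evaluated with the standard
    normal cumulative distribution (large-sample approximation).
    """
    if se <= 0:
        raise ValueError("se must be positive")
    if not 0 < d < abs(e):
        raise ValueError("this sketch assumes 0 < d < |e|")
    cdf = NormalDist().cdf
    x1 = (abs(e) + d) / se  # standardised distance to the far edge of the band
    x2 = (abs(e) - d) / se  # standardised distance to the near edge of the band
    return cdf(x1) - cdf(x2)


def se_from_interval(lower, upper, level=0.95):
    """Recover the standard error from a symmetric normal-theory interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    return (upper - lower) / (2 * z)


# Hypothetical values, chosen only to illustrate the call.
print(round(prob_within_tolerance(e=0.5, se=0.25, d=0.1), 3))
```

With these hypothetical inputs the standardised limits are 2.4 and 1.6, so the result is roughly 0.047. Taking the absolute value of e makes the function symmetric for negative effect sizes.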

In the case of the example in our previous paper, the probability of the actual effect size being between –0.1 and +0.1 (i.e., d = 0.1) is 0.09 for the confidence limits (–0.07, 0.81) (Johnson, 2005: Figure 1). For the broader confidence limits (–0.59, 1.33) (Johnson, 2005: Figure 2), this probability becomes 0.12. Thus, we concur with Johnson that the broader confidence interval actually gives more credence to the actual effect size being very small than does the narrower interval (although note that the likelihoods he calculates are for a one-tailed rather than a two-tailed hypothesis). The conclusions from Colegrave and Ruxton (2003) regarding the maximum effect sizes consistent with these confidence limits remain unchanged. We are grateful to Johnson for drawing our attention to this issue and hope that he will now feel that "...[his] efforts are both appreciated and contributing to the advance of science" (Johnson, 2005).
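
As a rough check on these figures, the arithmetic can be worked through under assumptions that the text does not state: that both intervals are 95% limits centred on the same estimate e = 0.37 (the midpoint of each interval), that the normal distribution applies, and that the probability has the form Φ(x1) - Φ(x2), with x1 = (|e| + d)/SE, x2 = (|e| - d)/SE and Φ the standard normal cumulative distribution.

$$\mathrm{SE}_1 \approx \frac{0.81 - (-0.07)}{2 \times 1.96} \approx 0.224, \qquad x_1 \approx \frac{0.47}{0.224} \approx 2.09, \qquad x_2 \approx \frac{0.27}{0.224} \approx 1.20, \qquad P_1 \approx \Phi(2.09) - \Phi(1.20) \approx 0.10$$

$$\mathrm{SE}_2 \approx \frac{1.33 - (-0.59)}{2 \times 1.96} \approx 0.490, \qquad x_1 \approx \frac{0.47}{0.490} \approx 0.96, \qquad x_2 \approx \frac{0.27}{0.490} \approx 0.55, \qquad P_2 \approx \Phi(0.96) - \Phi(0.55) \approx 0.12$$

Under these assumptions the broader interval reproduces the quoted 0.12, and the narrower interval gives a value close to the quoted 0.09. The small difference may reflect rounding or the use of a t-distribution, which the text does not specify. In either case the ordering is the same: the broader interval assigns the higher probability to a near-zero effect.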


    ACKNOWLEDGEMENTS
 
We thank Sean Nee for comments on this reply.


    REFERENCES
 
Colegrave N, and Ruxton GD, 2003. Confidence intervals are a more useful complement to nonsignificant tests than are power calculations. Behav Ecol 14:446–447.

Johnson DH, 2005. What hypothesis tests are not: a response to Colegrave and Ruxton. Behav Ecol 16:204–205.

