Today was a very one-sided conversation that I had with myself, and maybe that is how it will be given the subject. Remember also that the assigned readings are not the only readings. Feel free to use them as a springboard to explore other papers cited in these papers, papers that cite them, or papers on similar ideas that you find by googling.
1. Someone asked in what situations it is appropriate to use an alpha other than 0.05 (alpha<>0.05), and I gave a weak answer. A much better answer has to consider the two types of error in statistical inference:
type I: finding significance when none exists (more technically: rejection of the null hypothesis when the null hypothesis is true)
type II: failure to find significance when it exists (more technically: failure to reject the null hypothesis when the null hypothesis is false)
Alpha is the rate of type I error. Beta is the rate of type II error. Power is (1 - beta). With the question "do faster starts increase the probability of evasion" it is not a big deal if I fail to find a significant effect even if it exists (that is, if I have a high beta, or type II error rate). That is, there are not really "costs" to type II errors (of course there aren't really costs to type I errors either, more on this later). But with some questions, there are costs to both type I and type II errors. For example, given some drug, there is a cost to not taking it (risk of high blood pressure) and a cost to taking it (risk of seizure). So the experimenter should evaluate the relative cost of type I vs. type II error and set the alpha and beta necessary to minimize the weighted costs. See the "liberating alpha" section in this paper for a more detailed explanation.
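Here is a minimal sketch of what "minimize the weighted costs" could look like in practice. It is my own illustration, not the method from the paper. It assumes a one-sided, one-sample z-test, a standardized effect size that you are willing to guess in advance, and made-up cost weights; the function names (beta_for_alpha, best_alpha) and all the numbers are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def beta_for_alpha(alpha, effect_size, n):
    """Type II error rate of a one-sided, one-sample z-test.

    effect_size is the assumed true effect in standard-deviation units
    and n is the sample size (both are planning assumptions).
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(z_crit - effect_size * n ** 0.5)


def best_alpha(cost_type1, cost_type2, effect_size, n):
    """Grid-search the alpha that minimizes cost_type1*alpha + cost_type2*beta."""
    candidates = [i / 1000 for i in range(1, 500)]  # 0.001 ... 0.499

    def weighted_cost(alpha):
        beta = beta_for_alpha(alpha, effect_size, n)
        return cost_type1 * alpha + cost_type2 * beta

    return min(candidates, key=weighted_cost)


if __name__ == "__main__":
    # Hypothetical planning values: effect of 0.5 sd, 20 observations.
    for c1, c2 in [(5, 1), (1, 1), (1, 5)]:
        a = best_alpha(c1, c2, effect_size=0.5, n=20)
        b = beta_for_alpha(a, effect_size=0.5, n=20)
        print(f"cost I={c1}, cost II={c2}: alpha={a:.3f}, beta={b:.3f}")
```

The point of the sketch is only the direction of the result: when type II errors are the expensive ones, the cost-minimizing alpha moves well above 0.05, and when type I errors are the expensive ones it moves below.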
2. Can we ever conclude "given this post hoc power analysis, I had enough power to reject the null hypothesis, but I failed to reject it, so I am going to conclude that the null hypothesis is true and there is no treatment effect (e.g. faster starts do not matter)"? No, we can never come to this conclusion. Either the P-value was significant at the specified alpha (usually 0.05), or the most the post hoc power analysis can tell us is that we need to collect more data.
3. But a post hoc power analysis can tell us how much more data are needed to find an effect, so it is useful. Then again, what we thought was a completed study turns out to be a preliminary study that was necessary to find the parameters needed for a power analysis that can be used to design the experiment properly!
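As a rough sketch of the "how much more data" calculation in point 3, the usual normal-approximation formula for comparing two group means can be coded in a few lines. This assumes a two-sided test, equal group sizes, and a common standard deviation; the function name n_per_group and the example numbers are hypothetical, and a t-based calculation would give a slightly larger N.

```python
import math
from statistics import NormalDist


def n_per_group(mean_diff, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sample comparison.

    mean_diff and sd are the planning values (for example, estimates
    taken from a preliminary study). Uses the normal approximation
    n = 2 * ((z_(1-alpha/2) + z_power) * sd / mean_diff)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / mean_diff) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical pilot estimates: difference of 0.5 units, sd of 1.0.
    print(n_per_group(mean_diff=0.5, sd=1.0))
```

Keep in mind that the estimates from a small preliminary study are themselves noisy, so the N that comes out is a rough guide and not a guarantee.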
4. But we can always collect a big enough N to reject the null hypothesis, so can we ever apply meaning to the statistics? Yes, by emphasizing the size of the effect and not the P-value. I have now convinced myself that a lot of effort needs to go into thinking about what the minimum biologically meaningful effect size is, and that this can be done although, as someone stated in class, it will differ from problem to problem. Once we have a minimum biologically meaningful effect size (MBMES), we can compare the measured effect size to the MBMES, and if the measured effect size is smaller, we should conclude that there is no biologically important effect, regardless of the P-value.
5. I should have had everyone read this paper, in addition to the other two. This paper is short, very readable, and full of sage advice.
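To make point 4 concrete, here is a small sketch showing that a tiny, fixed effect goes from "not significant" to "highly significant" as N grows, while its comparison with the MBMES never changes. It assumes a two-sample z-test on a standardized mean difference; the function names, the effect of 0.02 sd, and the MBMES of 0.2 sd are all made up for illustration.

```python
from statistics import NormalDist


def two_sided_p(effect_size, n_per_group):
    """Two-sided P-value of a two-sample z-test.

    effect_size is the observed mean difference in standard-deviation
    units, with n_per_group observations in each of the two groups.
    """
    z = effect_size * (n_per_group / 2) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def interpret(effect_size, n_per_group, mbmes, alpha=0.05):
    """Return (statistically significant?, biologically important?)."""
    significant = two_sided_p(effect_size, n_per_group) < alpha
    important = abs(effect_size) >= mbmes
    return significant, important


if __name__ == "__main__":
    # Same tiny effect (0.02 sd), hypothetical MBMES of 0.2 sd.
    for n in (50, 5_000, 50_000):
        p = two_sided_p(0.02, n)
        print(n, round(p, 4), interpret(0.02, n, mbmes=0.2))
```

With 50,000 observations per group the effect is significant at 0.05 but still an order of magnitude below the MBMES, which is exactly the case where the P-value should not drive the discussion.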
Some important points in this paper:
1. Page F22, 2nd column, near top. Use of the phrases "very significant" and "highly significant" is not necessarily related to the biology, since the P-value is a function of the effect size, the variance, and the sample size. A very low or "highly significant" P-value (say, P<0.001) only gives one high confidence that the null model is false. But remember, the null model is probably strictly false for anything that we test, so it's not the P-value that should guide our discussion but the magnitude of the effect (the effect size) or something related to it, like the R^2.
2. Page F24, 1st column, near bottom. Report and interpret R^2 values, even in ANOVAs and other non-regressions. This value gives us some idea of the relative magnitude of the effect. Read about R^2 values in other papers and get to know these. They are not emphasized enough.
3. I liked the last example about the male vs. female food choice on the last page.
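On point 2 above (R^2 in ANOVAs), here is a minimal sketch of how an R^2 can be computed for a one-way ANOVA as the between-group sum of squares divided by the total sum of squares. The function name anova_r_squared is hypothetical and the data are toy numbers, not from any of the papers.

```python
from statistics import fmean


def anova_r_squared(groups):
    """R^2 for a one-way ANOVA: SS_between / SS_total.

    groups is a list of lists, one list of measurements per treatment
    level. The result is the fraction of the total variation in the
    response that is accounted for by group membership.
    """
    all_values = [x for group in groups for x in group]
    grand_mean = fmean(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_between = sum(
        len(group) * (fmean(group) - grand_mean) ** 2 for group in groups
    )
    return ss_between / ss_total


if __name__ == "__main__":
    # Toy data: two treatment levels, three measurements each.
    print(anova_r_squared([[1, 2, 3], [4, 5, 6]]))
```

Unlike the P-value, this number does not shrink toward zero just because N gets large, which is why it says something about the relative magnitude of the effect.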