Friday, July 29, 2016

Stop saying confidence intervals are "better" than p values

One of the common tropes one hears from advocates of confidence intervals is that they are superior to, or should be preferred over, p values. In our paper "The Fallacy of Placing Confidence in Confidence Intervals", we outlined a number of interpretation problems in confidence interval theory. We did this from a mostly Bayesian perspective, but the second section contained an example showing why, from a frequentist perspective, confidence intervals can fail. Many people missed this, however, because they assumed the paper was purely Bayesian advocacy. The purpose of this blog post is to expand on the frequentist example that many people missed; one doesn't have to be a Bayesian to see that confidence intervals can be less interpretable than the p values they are supposed to replace. Andrew Gelman briefly made this point previously, but I want to expand on it here so that the point (hopefully) comes across more clearly.

Tuesday, May 3, 2016

Numerical pitfalls in computing variance

One of the most common tasks in statistical computing is computation of the sample variance. This would seem to be straightforward; there are a number of algebraically equivalent ways of representing the sum of squares \(S\), such as \[ S = \sum_{k=1}^n ( x_k - \bar{x})^2 \] or \[ S = \sum_{k=1}^n x_k^2 - \frac{1}{n}\left(\sum_{k=1}^n x_k\right)^2, \] and the sample variance is simply \(S/(n-1)\).

What is straightforward algebraically, however, is sometimes not so straightforward in the floating-point arithmetic used by computers. Computers cannot represent numbers to infinite precision, and arithmetic operations can affect the precision of floating-point numbers in unexpected ways.
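To see the problem concretely, here is a minimal sketch (in Python, not code from the post itself, with parameters chosen purely for illustration) comparing the two-pass formula with the "textbook" one-pass formula on data whose mean is large relative to its spread:

```python
import numpy as np

def var_two_pass(x):
    """Two-pass sample variance: compute the mean first, then sum squared deviations."""
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    xbar = x.sum() / n
    return np.sum((x - xbar) ** 2) / (n - 1)

def var_one_pass(x):
    """'Textbook' one-pass formula: S = sum(x^2) - (sum(x))^2 / n.
    Algebraically equivalent, but vulnerable to catastrophic cancellation
    when the mean is large relative to the spread."""
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    s = np.sum(x ** 2) - np.sum(x) ** 2 / n
    return s / (n - 1)

# Data with a huge mean but tiny variance: the two functions should agree,
# but the one-pass version loses nearly all of its significant digits.
rng = np.random.default_rng(1)
x = 1e8 + rng.normal(0, 1, size=10_000)   # true variance is about 1

print(var_two_pass(x))   # close to 1
print(var_one_pass(x))   # wildly off, possibly even negative
```

Numerically stable single-pass alternatives do exist, such as Welford's online algorithm, which avoid subtracting two nearly equal large quantities.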


Sunday, April 3, 2016

How to train undergraduate psychologists to be post hoc BS generators

Teaching undergraduate psychology is difficult for a variety of reasons. Students come in with preconceived notions about what psychological research is, and they are sometimes disappointed by the mismatch between their preconceptions and reality. Much of what psychologists do is highly specialized and requires skills that are difficult to teach, and psychologists-in-training can't offer much, research-wise, until they have years of experience. The assignments we ask undergraduates to complete are meant to train their critical thinking skills and prepare them for a more substantive contribution to research. Sometimes, however, they do exactly the opposite: assignments can reward post hoc BS generation rather than actual critical thinking.



Wednesday, March 30, 2016

How to check Likert scale summaries for plausibility

Suppose you are reading a paper that uses Likert scale responses. The paper reports the mean, standard deviation, and number of responses. If we are -- for some reason -- suspicious of the paper, we might ask, "Are these summary statistics even possible for this number of responses on this Likert scale?" Someone asked me this recently, so I wrote some simple code to help check. In this blog post, I outline how the code works.
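The post goes on to describe that code; as a rough illustration of the idea, here is a hedged sketch of my own (not the post's code; the function name `likert_possible` and the brute-force strategy are assumptions) that enumerates every possible multiset of responses for small n and asks whether any of them reproduces the reported mean and SD after rounding:

```python
from itertools import combinations_with_replacement
from statistics import mean, stdev

def likert_possible(reported_mean, reported_sd, n, scale=(1, 5), digits=2):
    """Brute-force check: can ANY multiset of n responses on the given Likert
    scale reproduce the reported mean and SD once rounded to `digits` places?
    Only feasible for small n, since the number of multisets grows quickly."""
    lo, hi = scale
    for resp in combinations_with_replacement(range(lo, hi + 1), n):
        if (round(mean(resp), digits) == round(reported_mean, digits) and
                round(stdev(resp), digits) == round(reported_sd, digits)):
            return True, resp   # one consistent response set, as a witness
    return False, None

# A reported mean of 4.61 (SD = 0.20) on a 1-5 scale with n = 10 responses is
# impossible: ten integers cannot average to 4.61.
print(likert_possible(4.61, 0.20, n=10))
# A reported mean of 4.60 (SD = 0.52) with n = 10 is achievable.
print(likert_possible(4.60, 0.52, n=10))
```

For large n this brute force becomes infeasible, so a real check would need analytic constraints or a smarter search rather than exhaustive enumeration.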

Saturday, January 9, 2016

Asymmetric funnel plots without publication bias

In my last post about standardized effect sizes, I showed how averaging across trials before computing standardized effect sizes such as partial \(\eta^2\) and Cohen's d can produce arbitrary estimates of those quantities. This has drastic implications not only for meta-analysis but also for the interpretation of these effect sizes. In this post, I use the same facts to show how one can obtain asymmetric funnel plots — commonly taken to indicate publication bias — without any publication bias at all. You should read the previous post if you haven't already.
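As a rough illustration of how this can happen (my own sketch, not the simulation from the post; the parameter values and the assumed design tradeoff are mine), the snippet below averages trials before computing Cohen's d and never censors a single study, yet the funnel plot comes out tilted:

```python
import numpy as np

rng = np.random.default_rng(7)

def study_d(n_per_group, trials, raw_effect=0.5, sd_true=1.0, sd_trial=5.0):
    """Cohen's d (and an approximate SE) for one two-group study in which each
    participant's score is an average over `trials` trials."""
    noise_sd = sd_trial / np.sqrt(trials)   # trial noise left after averaging
    a = rng.normal(0.0,        sd_true, n_per_group) + rng.normal(0, noise_sd, n_per_group)
    b = rng.normal(raw_effect, sd_true, n_per_group) + rng.normal(0, noise_sd, n_per_group)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    d = (b.mean() - a.mean()) / pooled_sd
    se = np.sqrt(2 / n_per_group + d**2 / (4 * n_per_group))   # common approximation
    return d, se

# No study is censored or left in the file drawer. The only quirk is an assumed
# design tradeoff: studies with fewer participants run more trials per participant.
ns = rng.integers(10, 101, size=300)
ds, ses = zip(*(study_d(n, trials=1000 // n) for n in ns))

# A funnel plot of ds against ses comes out tilted: the least precise studies
# also tend to show the largest standardized effects, with no publication bias.
```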

Thursday, January 7, 2016

Averaging can produce misleading standardized effect sizes

Recently, there have been many calls for a focus on effect sizes in psychological research. In this post, I discuss how naively using standardized effect sizes with averaged data can be misleading. This is particularly problematic for meta-analysis, where differences in the number of trials across studies can lead to very misleading results.
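As a quick, hedged illustration of the phenomenon (my own sketch, not an example from the post; all parameter values are arbitrary), the snippet below holds the raw-unit effect fixed and varies only how many trials are averaged per participant before Cohen's d is computed:

```python
import numpy as np

rng = np.random.default_rng(0)

def d_after_averaging(trials, n=100_000, raw_effect=0.5, sd_true=1.0, sd_trial=3.0):
    """Cohen's d between two groups after each participant's response is
    averaged over `trials` trials. The raw-unit effect stays at `raw_effect`,
    but the denominator of d shrinks as more trials are averaged."""
    noise_sd = sd_trial / np.sqrt(trials)    # trial noise left after averaging
    a = rng.normal(0.0,        sd_true, n) + rng.normal(0, noise_sd, n)
    b = rng.normal(raw_effect, sd_true, n) + rng.normal(0, noise_sd, n)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (b.mean() - a.mean()) / pooled_sd

for trials in (1, 10, 100):
    print(trials, round(d_after_averaging(trials), 2))
# The same 0.5-unit raw effect yields d of roughly 0.16, 0.36, and 0.48,
# purely because of how many trials were averaged per participant.
```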