Welcome!

Welcome to my blog, a place to explore and learn about the experience of running a psychiatric practice. I post about things that I find useful to know or think about. So, enjoy, and let me know what you think.



Thursday, January 14, 2016

DIY Study Evaluation

If you have any interest at all in being able to evaluate the results of clinical trials on your own, say because you don't trust what the pharmaceutical companies are telling you, then I HIGHLY recommend you head on over to 1 Boring Old Man and read through his posts from the last few weeks. Basically, he's writing a statistics manual for clinicians, complete with downloadable spreadsheets of his own devising.

His explanations are clear, but I wanted to make sure I could do this on my own, so I tried it out. Here's how it worked.

I would categorize myself as a fairly conservative prescriber, by which I mean that I'm not eager to jump on the new drug bandwagon, and I like to wait a year or two, until we know a little about the effects and side effects of a new drug, before I write for it. I also wait a few weeks before upgrading my iOS, for the same reason, so there ya go. But I recently had occasion to prescribe the antidepressant Brintellix (vortioxetine). I can't get into the clinical details, but suffice it to say there were reasons. So with Brintellix on my mind, I decided to try out the 1 Boring Old Man spreadsheet on one of the Brintellix studies I found on clinicaltrials.gov, specifically, Efficacy Study of Vortioxetine (LuAA 21004) in Adults with Major Depressive Disorder, the results of which were submitted to clinicaltrials.gov in October 2013.

From the get-go, it looks like a poor study. There were 50 study sites scattered across Asia, Europe, Australia, and Africa, and it looks like they did something to the outcome measures midstream. But I'm just trying out the spreadsheet, so I'm ignoring all that for now.

The primary outcome measure was change in HAM-D score, which means that I needed to use the spreadsheet for continuous variables, because mean change could have been any number. If the measure had been "achieved remission," however they define "remission," then the results would have been tabulated in Yes/No form, and I would have needed the different spreadsheet designed for categorical variables.

But let me pause here and ask a question: Just what am I looking for? Well, I'm looking for effect size, which generally isn't given in results. Usually, we just get to see p-values, but I'll get to why that's not sufficient in a later post.

As a reminder, effect size is the difference between treatment groups, expressed in standard deviations. Roughly speaking, a large effect size is 0.8, medium is 0.5, and small is 0.2. So, for example, if the effect size of A v. B is 0.8, then A did 0.8 of a standard deviation better than B, and this is considered a large effect. So if I know the effect size, then I can tell how much better one group did than another. I can quantify the difference between groups. Cohen's d is often used as a measure of effect size.

It turns out that you only need three pieces of information to determine effect size, all generally available in a typical paper. For each arm of the study, you need the number of subjects in that arm, the mean, and either the standard deviation (SD) or the standard error of the mean (SEM), which are interchangeable via the formula:

SEM = SD / √n, which rearranges to SD = SEM × √n

That's it: n, mean, SEM.
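
If you'd rather script this than use a spreadsheet, here is a minimal Python sketch of the same calculation. To be clear, this is my own illustration, not the 1BOM spreadsheet: the function name is made up, and the pooled standard deviation is the standard textbook one used for Cohen's d.

import math

def cohens_d(n1, mean1, sem1, n2, mean2, sem2):
    """Cohen's d for two arms, given n, mean, and SEM for each arm."""
    # Convert each SEM back to a standard deviation: SD = SEM * sqrt(n)
    sd1 = sem1 * math.sqrt(n1)
    sd2 = sem2 * math.sqrt(n2)
    # Pool the two SDs, weighting by degrees of freedom
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    # Effect size: difference in means, in units of the pooled SD
    return (mean1 - mean2) / pooled_sd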

Here is that information from the study report. Note that there were four arms: placebo, and vortioxetine 1mg, 5mg, and 10mg.

[Table from the study report: n, mean change in HAM-D, and SEM for each of the four arms]

Let's plug 'em all into the 1BOM spreadsheet. One caveat: I'm not showing the ANOVA here, which you really need to run first, to make sure the four groups aren't all the same in comparison to each other; if they are, then any result you get from comparing one group directly with one other group is invalid. Just so you know, I computed the ANOVA using this calculator, also recommended by 1BOM, which requires exactly the same information you need to compute effect sizes, and it turns out that the groups are NOT all the same (this is another thing related to the p-value, which I plan to discuss in a later post).
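
For the curious, here's a rough Python sketch of what such an ANOVA calculator does under the hood, reconstructed from the standard one-way ANOVA formulas for summary statistics. It's my own sketch, not the calculator 1BOM recommends, and it leans on scipy only to turn the F statistic into a p-value.

import math
from scipy.stats import f  # F distribution, for the p-value

def one_way_anova(arms):
    """One-way ANOVA from summary statistics.

    `arms` is a list of (n, mean, sem) tuples, one per treatment arm.
    Returns the F statistic and its p-value.
    """
    k = len(arms)
    total_n = sum(n for n, _, _ in arms)
    grand_mean = sum(n * m for n, m, _ in arms) / total_n
    # Between-group scatter: how far each arm's mean sits from the grand mean
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in arms)
    # Within-group scatter: each arm's SD recovered from its SEM
    ss_within = sum((n - 1) * (sem * math.sqrt(n)) ** 2 for n, _, sem in arms)
    df_between, df_within = k - 1, total_n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, f.sf(f_stat, df_between, df_within)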

[1BOM spreadsheet output: effect sizes for each pairwise comparison among the four arms]

The top three rows show the effect sizes for the three active arms compared with placebo. Note that the effect sizes, 0.423 to 0.591, fall in the small-to-moderate range.

In the next three rows, I also checked to see how the active arms compared with each other in a pairwise fashion, and the 10mg really doesn't do much better than the 5mg or even the 1mg, with 0.170 the largest effect size.

Just considering effect sizes in this one study, Brintellix looks okay.

So you can see that there are powerful things you can do in the privacy of your home, using only minimal information, to understand what a study is really telling you. That feels pretty good. At the same time, you have to take into account other elements, like the fact that they seem to have changed outcome measures after the protocol was already established. That should invalidate the whole kit and caboodle. But sometimes you need to try out a new drug, and the studies aren't great, and this is the best you can do.

Sunday, July 28, 2013

Statistically Writing: Variance and Standard Deviation

I hope people weren't too annoyed by my previous statistics post about measures of central tendency. But it's important to really understand the concept of a mean, and its implications for research, before moving on to bigger and better things. This time around, we're going to look at measures of dispersion.


Say you have a set of data points, and you've figured out the mean for that set. You might then want to know how far from the mean each of your data points is. So if you subtract the mean from each data point and take the absolute value, you have that distance for each point.

Consider teenagers. You have a group of 5 teens, and each spends a certain number of hours per day on Facebook:

T1=3; T2=5; T3=2; T4=6; T5=2

If you calculate the mean here, you get (3+5+2+6+2)/5 = 3.6. So, on average, each teen spends 3.6 hours per day on Facebook.

Now suppose you want to know how close or far from average each kid's time on Facebook is (Why? To see if your kid is a freak):

T1: |3-3.6|= 0.6
T2: |5-3.6|= 1.4
T3: |2-3.6|= 1.6
T4: |6-3.6|= 2.4
T5: |2-3.6|= 1.6

Well, that's nice, but notice, you have another data set here, for which you can also find the mean. This is called the Mean Absolute Deviation. In this case, it's (0.6 + 1.4 + 1.6 + 2.4 + 1.6)/5 = 1.52 hours.
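
In code, the whole calculation so far fits in a few lines of plain Python (the variable names are mine):

hours = [3, 5, 2, 6, 2]  # hours per day on Facebook, one entry per teen
mean = sum(hours) / len(hours)  # 3.6
mad = sum(abs(x - mean) for x in hours) / len(hours)
print(mad)  # 1.52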


But Mean Absolute Deviation is not Variance. Variance, denoted by σ², is the average of the squared deviations rather than the absolute ones:

σ² = Σ(xᵢ − μ)² / N

So here, the Variance = (0.6² + 1.4² + 1.6² + 2.4² + 1.6²)/5 = 13.2/5 = 2.64.
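
You can check that with Python's standard library, which has a built-in for population variance:

import statistics
print(statistics.pvariance([3, 5, 2, 6, 2]))  # 2.64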


You may recall from my last stats post that I wrote about the distinction between the sample mean and the population mean. In the above example, the 5 teens constitute our entire population, and the formula above is for population variance, denoted by σ². (Also note that the population mean is denoted by μ.)

But let's say you want to use this group of 5 teens to estimate the average number of hours on Facebook for all teens in the US. Then the group of 5 teens is a sample. And weird as this may sound, a better way to estimate the variance of a population based on a sample is to calculate the "unbiased sample variance," denoted by s², where the result is computed by dividing by n − 1 rather than by n. (Roughly, the sample mean sits in the middle of your sample by construction, so deviations from it tend to understate deviations from the true population mean; dividing by n − 1 compensates.)

s² = Σ(xᵢ − x̄)² / (n − 1)

In this case, the unbiased sample variance = 13.2/4 = 3.3.
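
The standard library agrees here too; statistics.variance() is the sample version, dividing by n - 1:

import statistics
print(statistics.variance([3, 5, 2, 6, 2]))  # 3.3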

Variance is a useful measure of how far from the mean the data points are. But notice, it's a squared value, which exaggerates the distances: our variance of 2.64 is well above the mean absolute deviation of 1.52, even though both are built from the same five distances.

And if you have outliers, say some weird kid was on Facebook 20 hours a day, the variance will be huge. For those readers who thought my last statistics post was overly simplistic, this is where it starts to be important to know which measures are good for data with outliers, and which aren't.

Also, if your data is measured in hours, it's unintuitive to think about distance from the mean in hours squared. This is where Standard Deviation comes in handy.

Standard Deviation is nothing but the square root of variance:

For an entire population,

σ = √( Σ(xᵢ − μ)² / N )

And for a sample,

s = √( Σ(xᵢ − x̄)² / (n − 1) )

In this case, the Population Standard Deviation = √2.64 ≈ 1.62,

and the Sample Standard Deviation = √3.3 ≈ 1.82.
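
One last check in Python, where the two flavors of standard deviation get separate functions:

import statistics
print(statistics.pstdev([3, 5, 2, 6, 2]))  # about 1.62 (population)
print(statistics.stdev([3, 5, 2, 6, 2]))   # about 1.82 (sample)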


Visually, it looks something like this:

[Figure: the five data points on a number line, with the mean and one standard deviation to either side marked]

The mean is in blue, the data points in green, and the purple lines represent one standard deviation in each direction from the mean.