Wednesday, November 27, 2013

A Quick (And Probably Pedantic) Lesson On Correctly Interpreting Poll Numbers

On Twitter, @sqwerin noticed the following regarding Republican approval of Obama in the recent Quinnipiac poll of Ohio:

This seems to be a good time to make a few points about the margin of error in polls, at least one of which I don't think is commonly understood. But to make that point more intuitively obvious, I have to start with a few other things. Please note that to some extent this is pedantic, since it involves treating certain reported results as more precise than the reported margin of error. There really is no "harm" in thinking of things as being less precise (although one of these points covers a case where there is less precision, so that part is a bit more important; that one is more commonly understood, though).

But pedant is my middle name, so let's hop to it.

First, when a poll has breakouts among different subgroupings, the margin of error for those results is bigger than the margin of error for the entire survey. Using the poll linked above as an example, the overall sample has, according to Quinnipiac, a margin of error of +/- 2.7 percentage points. That is based on 1,361 registered voters in a state with ~9 million.

The number of Republicans, from which they measured the 3% approval referenced above, in the sample is obviously a fraction of the 1,361. This increases the margin of error. The universe of Republican registered voters is similarly much smaller than 9 million, and while that works to decrease the margin of error, it does so much less than the former does. To show this, let's assume that there are 450 Republicans in the sample but consider them pulled from a pool of 9 million. One can use this margin of error calculator, so as to not need to do the actual math, and see the margin of error is closer to +/- 5% than 2.7%. Change the universe to 3 million, and it does not move.
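For anyone who does want to do the actual math, here is a minimal sketch of what such a calculator presumably computes. It assumes a simple random sample, a 95% confidence level (z = 1.96), the worst-case 50% result, and the standard finite population correction; the function name and the 450 Republican respondents are illustrative assumptions, not figures from Quinnipiac.

```python
import math

def margin_of_error(sample_size, population=None, proportion=0.5, z=1.96):
    """Approximate margin of error (as a fraction) for a simple random sample.

    Assumes a 95% confidence level (z = 1.96) unless told otherwise.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    if population is not None:
        # Finite population correction: shrinks the error slightly when the
        # sample is a noticeable share of the population it is drawn from.
        standard_error *= math.sqrt((population - sample_size) / (population - 1))
    return z * standard_error

# Full sample, then the assumed 450 Republicans drawn from two pool sizes.
full_sample = margin_of_error(1361, population=9_000_000)
republicans_9m = margin_of_error(450, population=9_000_000)
republicans_3m = margin_of_error(450, population=3_000_000)

print(f"Full sample (n=1,361):         +/- {full_sample:.2%}")
print(f"Republicans (n=450, pool 9M):  +/- {republicans_9m:.2%}")
print(f"Republicans (n=450, pool 3M):  +/- {republicans_3m:.2%}")
```

The sample size sits under a square root in the denominator, so cutting it by two thirds moves the result a lot, while the pool size only enters through a correction factor that stays extremely close to 1 for pools in the millions.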

I believe that is fairly commonly understood.

Second, look at the results posted in the survey details. Everything is reported to the nearest whole number. Reporting a margin of error to the first decimal place while using whole numbers in the results does not make a lot of sense. The precision that would allow the margin of error to be 2.7% gets lost in the rounding to the nearest whole number, and therefore it should be reported as +/- 3%.

I think that is not as commonly understood as the first point, but more commonly than the next one.

Third, the reported margin of error in a survey is calculated for a result of 50%, and the further away from that the results are, the tighter the margin of error is. Consider my assumption above, which resulted in a margin of error of +/- 5% for the Republican subgrouping. It should be obvious that there is no way for the 3% measured to really be 5 percentage points lower, since it is bounded at the bottom by 0%. What may not be as obvious, though, is that the plus side is also not as high as 5 points. The reported number is more precise than the calculated margin of error. When the number is this close to zero, the margin of error is very small, even within a sample as small as the subsample of Republicans.
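To put rough numbers on that, here is a small sketch, again assuming a simple random sample, a 95% confidence level, and the hypothetical 450 Republican respondents from earlier. It compares the usual symmetric formula at 50% and at 3%, and also computes a Wilson score interval, a standard alternative that stays inside 0% to 100% and behaves better near the edges. The function names are mine.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level

def normal_margin(proportion, sample_size, z=Z_95):
    """Symmetric margin of error from the normal approximation."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

def wilson_interval(proportion, sample_size, z=Z_95):
    """Wilson score interval; never dips below 0 or above 1."""
    shrink = 1 + z**2 / sample_size
    center = (proportion + z**2 / (2 * sample_size)) / shrink
    half_width = (
        z
        * math.sqrt(
            proportion * (1 - proportion) / sample_size
            + z**2 / (4 * sample_size**2)
        )
        / shrink
    )
    return center - half_width, center + half_width

sample_size = 450  # assumed size of the Republican subsample
margin_at_50 = normal_margin(0.50, sample_size)
margin_at_3 = normal_margin(0.03, sample_size)
lower, upper = wilson_interval(0.03, sample_size)

print(f"Margin at 50%: +/- {margin_at_50:.1%}")
print(f"Margin at 3%:  +/- {margin_at_3:.1%}")
print(f"Wilson interval at 3%: {lower:.1%} to {upper:.1%}")
```

Under these assumptions the interval around 3% is lopsided: it has more room on the plus side than the minus side, but the plus side is still only about two points rather than five.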

The last point does not really help much in interpreting the approval rating among Republicans, in that most people would get the implications of the reported number: there is little downside for Obama among Republicans in Ohio. But it has more relevance for the 34% in the full sample who approve of Obama's performance. Running the standard formula at 34% with 1,361 respondents moves the 2.7% reported margin of error down to roughly 2.5%, which sits right on the line between +/- 2% and +/- 3% once the rounding effect mentioned above is applied.
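A quick check of the 34% case, as a sketch only: it assumes a simple random sample of 1,361 and a 95% confidence level, and ignores any weighting or design effect Quinnipiac may apply, so the real figure could differ a little.

```python
import math

Z_95 = 1.96          # z-score for a 95% confidence level
SAMPLE_SIZE = 1361   # registered voters in the full sample

def margin_in_points(proportion, sample_size=SAMPLE_SIZE, z=Z_95):
    """Margin of error in percentage points for a measured proportion."""
    return 100 * z * math.sqrt(proportion * (1 - proportion) / sample_size)

margin_at_50 = margin_in_points(0.50)  # the figure polls normally report
margin_at_34 = margin_in_points(0.34)  # at the measured approval rating

print(f"At 50%: +/- {margin_at_50:.2f} points")
print(f"At 34%: +/- {margin_at_34:.2f} points")
print(f"At 34%, rounded to a whole number: +/- {round(margin_at_34)}")
```

With these inputs the margin at 34% comes out just above 2.5 points, so the gain in precision over the 50% figure is real but small.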

When a survey has a sample size as large as Quinnipiac used here, none of this matters much. However, all three of these points can have their utility when looking at particular polls, especially ones where the sample is small.
