This phrase is often attributed to Mark Twain though that is disputed.
Statistics is a very powerful branch of mathematics. It lets you see things that are not always obvious at a glance. It is also somewhat complicated. Because I’m no longer a math nerd, having become a computer nerd, my go-to book whenever I actually have to do statistics is The Cartoon Guide to Statistics.
My favorite story from the book, which I repeat often, is the story of a small postgraduate business school. As part of its recruiting drive, the school reported that the average salary for somebody graduating with its MBA was well above $150k.
This sounded wonderful. What they failed to mention was that they got this mean by adding up the first-year salaries of all their graduates and dividing by the number of graduates, including the one graduate who went into the NBA with a multimillion-dollar first-year salary. With such a small graduating class, that one salary drove the mean (average) far above the median.
The school didn’t tell a lie, they didn’t tell a damn lie, they just used statistics to lie for them.
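A quick sketch in Python shows how one salary can do this. The class size and salary figures below are invented for illustration; they are not the numbers from the book.

```python
from statistics import mean, median

# Hypothetical first-year salaries for a class of 20 graduates.
# These figures are assumptions made up for this example.
salaries = [70_000] * 19 + [3_000_000]  # 19 typical salaries plus one NBA contract

class_mean = mean(salaries)
class_median = median(salaries)

print(f"mean:   ${class_mean:,.0f}")
print(f"median: ${class_median:,.0f}")
```

With these made-up numbers the mean lands well above $150k even though 19 of the 20 graduates earn $70k, which is what the median reports.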
Many measurements of natural phenomena fall into what is called “the bell curve.” The bell curve is defined by its mean (the sum of the samples divided by the number of samples) and its standard deviation (SD). The larger the standard deviation, the wider the bell; the smaller the standard deviation, the narrower the bell and the steeper its sides.
As shooters we can see this when we take samples of the muzzle velocity of a round. You collect enough samples and you will be able to calculate the mean and standard deviation. If the standard deviation is small then you know that most of your rounds have nearly the same velocity. If the SD is larger there is more fluctuation in the velocity and that will affect accuracy.
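As a rough illustration, here are two strings of chronograph readings compared in Python. The velocities are invented, and the names tight_load and loose_load are just labels for this sketch.

```python
from statistics import mean, stdev

# Hypothetical chronograph readings in feet per second (invented numbers).
tight_load = [2795, 2801, 2798, 2803, 2799, 2804, 2797, 2803]
loose_load = [2760, 2835, 2790, 2820, 2775, 2840, 2765, 2815]

for name, shots in (("tight", tight_load), ("loose", loose_load)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{name}: mean {mean(shots):.1f} fps, SD {stdev(shots):.1f} fps")
```

Both strings are built to have the same mean of 2800 fps, so the mean alone cannot tell them apart. Only the SD shows that the second load is far less consistent.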
When you are verifying that a result is significant and not just “noise,” a common rule of thumb is that the result should be at least two standard deviations away from the mean of your comparison group.
So if you have a population of 200 subjects and you give 100 of them a placebo and 100 of them a new medicine, and you want to compare the results, you calculate the mean and SD of both groups after your tests. Let’s say the value you are collecting is weight loss or gain. If the placebo group loses 10 pounds on average with an SD of 3, you have a baseline. If the other group loses an average of 13 pounds, that sounds “good,” and it is, but by this rule it is not significant. The group would need to lose an average of 16 pounds (10 + 2 × 3) for it to be considered significant.
There is another issue in the samples: the outlier. The first-year salaries above are an example of an outlier. What if our medicated group had one or two people who managed to lose not 16 pounds but 24 and 26 pounds? In a sample of 100, that moves the mean by about 0.2 pounds. That could easily be the difference between meeting the two-standard-deviation threshold and not.
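The arithmetic of that rule of thumb can be written out as a small Python sketch. The function name two_sd_threshold is made up for this example, and it implements only the rule as described here, not a formal hypothesis test (a formal test would also take the sample sizes into account).

```python
def two_sd_threshold(baseline_mean: float, baseline_sd: float) -> float:
    """Return the value two standard deviations beyond the baseline mean."""
    return baseline_mean + 2 * baseline_sd

placebo_mean_loss = 10.0  # pounds, from the example in the text
placebo_sd = 3.0          # pounds, from the example in the text

threshold = two_sd_threshold(placebo_mean_loss, placebo_sd)

for medicine_mean_loss in (13.0, 16.0):
    verdict = "clears" if medicine_mean_loss >= threshold else "falls short of"
    print(f"{medicine_mean_loss} lb {verdict} the {threshold} lb threshold")
```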
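To check that figure, here is a small sketch that assumes, purely for simplicity, that everyone else in the medicated group lost exactly 16 pounds.

```python
from statistics import mean

# Simplifying assumption: 100 people who each lost exactly 16 pounds.
group = [16.0] * 100

# Same group, except two people lost 24 and 26 pounds instead of 16.
with_outliers = [16.0] * 98 + [24.0, 26.0]

# The two outliers add 8 + 10 = 18 extra pounds, spread over 100 people.
shift = mean(with_outliers) - mean(group)
print(f"mean moves by {shift:.2f} pounds")
```

Two people out of a hundred are enough to move the group mean by 0.18 pounds, which rounds to the 0.2 quoted above.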
So we “throw out” the outliers. This can give you much better results.
And here we start to see some of the complications: when do you decide that a value is actually an outlier?
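One common convention, though by no means the only one, is to flag anything more than 1.5 times the interquartile range (IQR) outside the middle half of the data. The sketch below uses that convention with invented weight-loss numbers; the function name tukey_outliers and the cutoff k are choices made for this example.

```python
from statistics import quantiles

def tukey_outliers(values, k=1.5):
    """Flag values more than k * IQR below the first or above the third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical weight losses in pounds (invented numbers).
losses = [11, 12, 12, 13, 13, 13, 14, 14, 15, 24, 26]

print(tukey_outliers(losses))         # conventional cutoff, k = 1.5
print(tukey_outliers(losses, k=3.0))  # stricter cutoff
```

With these numbers the conventional cutoff throws out both 24 and 26, while the stricter one throws out only 26. The data did not change, only the rule did, and that choice is exactly where the trouble starts.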
How to lie with statistics
The first way to lie with statistics is to measure the wrong thing. In gun rights we see this all the time. When crime statistics are compared, we often hear only about violent crimes involving guns. Is this the right measurement? There is a strong argument that it is not; the correct measurement is violent crime overall. Thus we see in the UK that the violent crime rate with guns has dropped since they banned guns, but the overall violent crime rate has not.
In the same way, it matters how different groups define what is being measured. A commenter explained to me that in the UK it is only counted as a murder if somebody is convicted of the crime of murder. If there is no conviction, then it wasn’t murder. In the United States, murder is defined such that it does not require a conviction. If you compare murders reported in the UK with murders reported in the US, you are not comparing the same thing.
The same problem shows up when you compare a measurement against its own past values. A good example is the consumer price index. From memory, there was a point in time when the CPI was going up faster than the government wanted it to. The CPI was based on a “basket of goods and services.” Included in that basket were steak and other high-quality goods that were purchased on a regular basis. The government decided that the basket was no longer representative because the cost of steak had gone up so much that people weren’t buying steaks as often, so they replaced steak with hamburger “because that’s what the people are buying.”
From a statistical point of view this means that comparing values of the CPI from before the change to after the change really doesn’t work.
By changing the definition they lied with statistics.
In the gun rights world we see this with the definitions of “mass shooting” and “school shooting.” The FBI has defined both of those. Fortunately, it turns out that there are very few mass shootings and fewer still mass school shootings. By changing the definitions you can increase the reported number of mass shootings.
So instead of “4 or more killed excluding the shooter” we end up with the much more inclusive definition of “4 or more people killed or wounded including the shooter.”
The FBI excludes gang violence. The fearmonger definition does not. This is why we hear claims of “more mass shootings than days.” If it were truly the case that there was a mass shooting a day, we would be hearing about it non-stop. Instead, most of these shootings are gang shootings, and they don’t make the news.
They lied with statistics by changing the definition of mass shooting to include many more events, events that most people would not consider to be mass shootings.
The fearmongers have redefined “school shooting” in a similar way. Instead of a shooter entering a school and shooting students and staff, they use “a gun was fired on school property.” That’s how we end up with school shootings that include a man who committed suicide in a school parking lot, drug deals gone wrong on school property after hours and after dark, and a shooting in a school bus parking lot after midnight. Yes, all of these took place on school property, but most of them did not involve actual students in school.
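The effect of swapping definitions is easy to see with a toy example. The incident records below are invented and are not real data; the two functions are a hypothetical encoding of the narrow and broad definitions quoted above.

```python
from dataclasses import dataclass

@dataclass
class Incident:
    killed: int         # victims killed, not counting the shooter
    wounded: int        # victims wounded, not counting the shooter
    shooter_hurt: bool  # shooter was killed or wounded
    gang_related: bool

# Invented records for illustration only.
incidents = [
    Incident(killed=5, wounded=3, shooter_hurt=True, gang_related=False),
    Incident(killed=1, wounded=3, shooter_hurt=False, gang_related=True),
    Incident(killed=0, wounded=3, shooter_hurt=True, gang_related=False),
    Incident(killed=0, wounded=4, shooter_hurt=False, gang_related=True),
    Incident(killed=2, wounded=0, shooter_hurt=False, gang_related=False),
]

def narrow(i: Incident) -> bool:
    """4 or more killed excluding the shooter, with gang violence excluded."""
    return i.killed >= 4 and not i.gang_related

def broad(i: Incident) -> bool:
    """4 or more killed or wounded, including the shooter."""
    return i.killed + i.wounded + int(i.shooter_hurt) >= 4

narrow_count = sum(narrow(i) for i in incidents)
broad_count = sum(broad(i) for i in incidents)
print(f"narrow definition: {narrow_count}, broad definition: {broad_count}")
```

The same five made-up records produce one mass shooting under the narrow definition and four under the broad one.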
Another example of the definition game is in defensive gun use. The article that Miguel posted on July 25, 2022 uses a definition of “killing the suspect/attacker/criminal.” This definition ignores merely wounding the animal. It ignores presenting the weapon and having that stop the criminal act. It ignores all the other ways that a DGU happens where nobody ends up dead.
One last definition game: conflating suicide, justifiable homicide, and murder under the term “gun violence.”
Next on our list of ways to lie with statistics is sample selection. The advertisements used to be “4 out of 5 dentists recommend Crest Toothpaste.” That is a cool trick. How many dentists did they sample? Did they sample 5 and stop with the first one who said something else? Or did they sample a few thousand?
How did they select that sample? Did they send out free “patient cleaning kits” to dentists with the small tube of Crest and include a survey form? Did they call dentists who had gotten samples? Or did they call a large random sample of dentists?
This is how you can have the same survey question asked on two different platforms and get completely different results. If you asked “What is better, to harden schools by allowing teachers and staff to carry or to add metal detectors at the entrance?” here or on Daily KOS, you would get very different answers. The same goes for asking Fox viewers v. CNN viewers: you get vastly different answers because your sample selection is vastly different.
In polling, which questions are asked and how they are worded also matter.
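Here is a toy version of that effect in Python. The two audiences, site_a and site_b, and the split of opinions within each are assumptions invented for this sketch, not poll results.

```python
# Invented population: two audiences answering the same question.
# Each entry is (audience, answer); the proportions are assumptions.
population = (
    [("site_a", "armed staff")] * 450 + [("site_a", "metal detectors")] * 50
    + [("site_b", "armed staff")] * 75 + [("site_b", "metal detectors")] * 425
)

def share(sample, answer):
    """Fraction of respondents in the sample who gave the stated answer."""
    return sum(1 for _, a in sample if a == answer) / len(sample)

only_a = [r for r in population if r[0] == "site_a"]
only_b = [r for r in population if r[0] == "site_b"]

for label, sample in (("site_a only", only_a),
                      ("site_b only", only_b),
                      ("everyone", population)):
    print(f"{label}: {share(sample, 'armed staff'):.1%} favor armed staff")
```

Polling only one audience gives 90% or 15% for the same question, while the combined population sits at 52.5%. Nobody changed an answer; only the choice of who got asked changed.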
All of these lead to answers based on statistics that completely lie.
Never, ever trust a statistic without knowing what was measured, how it was measured, what the error rate is, and what the standard deviation is.
Why do they think we are so stupid that we can’t see what they are doing?