Let’s Get Empirical!
As I mentioned in my post yesterday, I don’t really have a post’s worth of material on the subject of guns.
However, while reaching this conclusion, I realised there was a contribution I could make to this topic. Whenever the social effects of guns are debated, studies on the relationship between guns and violent crime are often produced by both sides. On the whole, this is a good thing – policy should be evidence based, unless you’re holding to a strictly deontological position.
I have no special insights to offer on the merits of any given study – crime isn’t a subject I’ve studied very much. But I’m still conversant with the basic empirical methods used to research a question like this, so I thought it might be useful to examine the empirical challenges researchers have to face in investigating an issue like guns and their effect on crime. This will hopefully help people to know whether the authors of any given paper have done their homework.
Measurement
Before you can even think of researching a question like “What effect do guns have on crime?”, you have to clearly define guns and crime. This is less straightforward than it may first appear. For example:
- Do you look at all crimes equally? This means that preventing 2 cases of trespassing or petty theft cancels out causing 2 murders.
- Or you could look at violent crime only, though this will lead to excluding non-violent offences that still affect people’s quality of life, and it still bundles common assault with murder.
- If you’re feeling fancy, you could build a severity-weighted crime index. Of course this creates a variable that is vulnerable to police “juking” the crime statistics, since it now matters to your analysis how a given crime is classified.
- In practice, crime researchers often focus on homicides. It is the most severe type of crime, and it has the best data, since you can’t juke a dead body.
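To make the “juking” worry about a severity-weighted index concrete, here is a minimal sketch. The crime categories, weights and counts are all made up for illustration; a real index would take its weights from something like sentencing data.

```python
# Hypothetical severity weights, invented for this example.
SEVERITY_WEIGHTS = {
    "petty_theft": 1,
    "common_assault": 5,
    "aggravated_assault": 20,
    "murder": 100,
}


def severity_index(counts, weights=SEVERITY_WEIGHTS):
    """Sum of crime counts, each multiplied by its severity weight."""
    return sum(weights[crime] * n for crime, n in counts.items())


# The incidents as they actually happened (made-up numbers).
honest = {"petty_theft": 200, "common_assault": 40,
          "aggravated_assault": 10, "murder": 2}

# The same incidents, but five aggravated assaults are recorded
# as common assaults. Murders are much harder to reclassify.
juked = {"petty_theft": 200, "common_assault": 45,
         "aggravated_assault": 5, "murder": 2}

print(severity_index(honest))  # 800
print(severity_index(juked))   # 725
```

The number of incidents is identical in both cases, yet the index drops by nearly 10% purely because of how they were recorded. An unweighted count would not move at all, and neither would the homicide count.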
For guns, an important consideration is how to deal with different types of guns. Is a semi-auto rifle the same as a shotgun, the same as a revolver, the same as a refurbished flintlock (and if not, how are you going to treat them differently)? Also, are we counting guns, or people with guns?
There are no objectively superior measures of crime and guns. It all depends on what you can get data for and what hypothesis you are trying to test. The important thing to ask is “What are the authors actually measuring here?”, because if an article is being cited in support of or in opposition to a proposed gun control policy, the specific hypothesis matters. A law that would change the composition of guns that are owned, but not the number, can’t be tested by looking at the total number of guns. Equally, a measure that would limit how many guns a person can own can’t be evaluated by comparing the proportion of gun owners to crime rates.
Finally, check the authors aren’t comparing different population centres (or the same centre over a long time period) without adjusting for population size. After all, all things being equal, a city that is 10 times the size of another will have 10 times the crime, and 10 times as many guns. This is a rookie mistake (or a sign of shenanigans), so you shouldn’t see this problem very often, but it is still something to look out for.
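The population adjustment is simple arithmetic. A quick sketch, using two invented cities (the names and figures are placeholders, not real data):

```python
def rate_per_100k(count, population):
    """Convert a raw count into a rate per 100,000 residents."""
    return count * 100_000 / population


# Made-up cities: one is ten times the size of the other.
cities = {
    "Smallville": {"population": 100_000, "homicides": 5},
    "Metropolis": {"population": 1_000_000, "homicides": 50},
}

for name, data in cities.items():
    rate = rate_per_100k(data["homicides"], data["population"])
    print(f"{name}: {data['homicides']} homicides, {rate:.1f} per 100,000")
```

The raw counts differ by a factor of ten, but both cities come out at 5.0 homicides per 100,000 people. Any comparison made on the raw counts would be measuring city size and nothing else.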
Signal and Noise
Once you’ve settled on your measures, the next step is to try and work out how much of the change in your dependent variable (the thing you are trying to understand) is correlated with changes in your explanatory variable (the variable you think causes changes in the dependent variable), as opposed to any of the other things that might be affecting it.
Figuring out what causes what is called the attribution problem, and it has been an on-going problem since the dawn of humanity’s search for knowledge. The best solution is the controlled experiment, where you set up two identical situations, change one thing in one of them, and watch what happens. This gives you nice, clean comparative data. However, it is generally very difficult to perform a controlled experiment in the social sciences (or even in some of the physical sciences), which means that attribution has to be determined through less direct methods.
When looking at real-world data, rather than experimental data, you cannot simply assume that any change in your dependent variable can be explained by changes in your explanatory variable. For example, consider the following argument:
1 – The UK has a lower murder rate than the US.
2 – The UK has stricter gun controls than the US.
3 – Therefore, introducing stricter gun controls in the US will lower the murder rate.
Point 3 does not follow from points 1 and 2. Aside from the fact that correlation isn’t the same thing as causation, there are any number of other factors (collectively referred to as confounds) that could be affecting the murder rate other than gun control. Here’s a list I came up with after thinking about it for a minute:
- Population density
- Law enforcement approaches
- Income levels
- Levels of social trust
- Cultural views on the acceptability of violence as a dispute-resolution mechanism
- Preferences for owning guns (i.e. it may be that UK gun control does nothing but because people over there don’t want as many guns as Americans do, the result is still less violence)
The only way to sort out this mess of confounding variables is to use statistics to estimate the effect (if any) of all these variables. This means a lot of data (i.e. more than two countries), and looking at all the data at once (not just comparing two variables at a time). If you don’t account for all the confounding variables (or, more realistically, as many as you can), then you will end up giving too much weight to the variables you did put in your model.
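Here is a toy simulation of what goes wrong when a confound is left out. Everything in it is invented: a single confounder (labelled poverty, purely for illustration) drives both guns and crime, and the true effect of guns on crime is set by hand to 0.5 so we know what the right answer is. The “controlled” estimate strips the confounder out of both variables before comparing them, which gives the same answer as including it in a multiple regression.

```python
import random


def slope(x, y):
    """Ordinary least squares slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """What is left of y after removing its linear relationship with x."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


random.seed(0)
n = 5000
TRUE_EFFECT = 0.5  # assumed effect of guns on crime in this toy world

# The confounder pushes up both guns and crime.
poverty = [random.gauss(0, 1) for _ in range(n)]
guns = [p + random.gauss(0, 1) for p in poverty]
crime = [TRUE_EFFECT * g + 2.0 * p + random.gauss(0, 1)
         for g, p in zip(guns, poverty)]

# Two variables at a time: the confounder's effect gets credited to guns.
naive = slope(guns, crime)

# Controlling for the confounder: remove it from both sides first.
controlled = slope(residuals(poverty, guns), residuals(poverty, crime))

print(f"naive estimate:      {naive:.2f}")
print(f"controlled estimate: {controlled:.2f}")
```

With these settings the naive estimate should land near 1.5, three times the true effect, while the controlled estimate should land near 0.5. The variable you put in the model has soaked up the weight of the one you left out.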
Whenever you’re reading a social science study ask yourself “What variables did the authors control for? Are there other variables they could reasonably have controlled for?”
Endogeneity and Causation
The statistical methods I described above will deal with most of your data-related needs. Sure, there are hundreds of modifications you might need to employ to deal with particular quirks in your data, but the essential principle is the same.
There is, however, one complication that is harder to deal with – endogeneity. Endogeneity is a situation where your dependent and explanatory variables have a causal effect on each other. For example, if guns cause crime and higher crime drives demand for guns (or the reverse, or any combination of those things) then we have a serious problem. The statistical approaches to data analysis I mentioned above only work when causal arrows run in one direction. When you have endogenous variables, those approaches have no way to disentangle the two causal effects (A causing B and B causing A), which means that you inevitably end up with an utterly misguided analysis.
There is no mathematical trick to get around this problem. The only solution is to look for an instrument: a change in the explanatory variable that couldn’t possibly have been caused by the dependent variable. For example, if a pack of ravenous rust monsters descended on the United States and destroyed most of the guns in the country, we could see what happened to crime rates following the Great Rust Monster Incident of 2013 without wondering if it was crime affecting the number of guns (unless the rust monsters were released by some nefarious Dungeons and Dragons themed master criminal, in which case all bets would be off). This approach to analysing endogenous variables has been most notably popularised by Freakonomics. While the approach Levitt and Dubner take to analysing serious phenomena in quirky ways may seem like light-hearted entertainment to the uninitiated, it serves a serious purpose – it is when weird things happen that instruments are most likely to be found. And since researchers can’t create an instrument, they have to wait for one to turn up, so every anomalous event is a potential research paper waiting to be written. Instrumental variable analysis also helps resolve the problem of causation, since you’ve sorted out which way the causal arrow runs before you even start crunching numbers.
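For the curious, here is a toy version of the rust monster scenario. All the numbers are assumptions chosen for illustration: guns raise crime with a true effect of 0.2, crime raises demand for guns one-for-one, and the rust monsters are a random shock that hits guns directly and has nothing to do with crime. The instrument-based estimate below is the simplest textbook form (the ratio of two covariances), not a full treatment.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


random.seed(1)
n = 20_000
BETA = 0.2   # assumed true effect of guns on crime
GAMMA = 1.0  # assumed effect of crime on demand for guns

rust = [random.gauss(0, 1) for _ in range(n)]  # the instrument
u = [random.gauss(0, 1) for _ in range(n)]     # other causes of crime
v = [random.gauss(0, 1) for _ in range(n)]     # other causes of gun ownership

# The two equations feed into each other:
#   crime = BETA * guns + u
#   guns  = GAMMA * crime + rust + v
# Solving them together gives the values we would actually observe.
guns = [(r + GAMMA * ui + vi) / (1 - GAMMA * BETA)
        for r, ui, vi in zip(rust, u, v)]
crime = [BETA * g + ui for g, ui in zip(guns, u)]

# Ordinary regression slope: mixes up both causal directions.
ols = cov(guns, crime) / cov(guns, guns)

# Instrument estimate: only uses the variation in guns caused by rust monsters.
iv = cov(rust, crime) / cov(rust, guns)

print(f"true effect: {BETA}")
print(f"ordinary regression: {ols:.2f}")
print(f"using the instrument: {iv:.2f}")
```

In this setup the ordinary regression should come out well above 0.2 (around 0.47), because it cannot tell guns causing crime apart from crime causing guns. The instrument estimate should sit close to 0.2, because the rust monsters move guns without being moved by crime.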
When reading a study ask yourself “Is this variable likely endogenous? If so, have the authors used an instrument?” If they haven’t, then the study is highly suspect.
There is a lot more about this subject I could go into, but things would start to get pretty technical from here on in, and I think these are the primary things a lay audience would be able to spot. So whenever you are reading an empirical study, especially if you agree with the conclusions, remember to ask these questions and you’ll keep out of trouble for the most part:
- What are the authors actually measuring here?
- What variables did the authors control for? Are there other variables they could reasonably have controlled for?
- Is this variable likely endogenous? If so, have the authors used an instrument?
Wow, this should be something we link to a lot. It’s like a League version of The Demon Haunted World.
Being likened to Sagan? That’s the sort of thing that could go to one’s head.
Thanks Tod.
Having a pic of Olivia Newton John would have been funnier. Just saying. Pointless pop culture references add credibility.
I disavow any knowledge of the relevant pop culture reference [shiftyeyes]
heavy sigh…kids these days…..Olivia Newton John, an Aussie, had a hit many years ago called Physical with an endlessly repeated chorus of “Let’s Get Physical…Physical”
Sorry, did the “shiftyeyes” notation not make sense to you?
I assumed it was just Kiwi nervousness acknowledging anything Ozzie-related.
Excellent.
Human beings are wretched assessors of risk. Without the sword and armour of data and proper analytical methods, we resort to emotions and not good sense. We’re afraid of terrorists, of mad shooters in malls and schools, of the statistically rare. We’re not afraid of situations we should fear: they’re so common we can’t go on with our daily lives fearing them. The sleep of reason breeds monsters.
Blaise Pascal:
On what shall man found the order of the world which he would govern? Shall it be on the caprice of each individual? What confusion! Shall it be on justice? Man is ignorant of it.
Certainly had he known it, he would not have established this maxim, the most general of all that obtain among men, that each should follow the custom of his own country. The glory of true equity would have brought all nations under subjection, and legislators would not have taken as their model the fancies and caprice of Persians and Germans instead of this unchanging justice. We should have seen it set up in all the States on earth and in all times; whereas we see neither justice nor injustice which does not change its nature with change in climate. Three degrees of latitude reverse all jurisprudence; a meridian decides the truth. Fundamental laws change after a few years of possession; right has its epochs; the entry of Saturn into the constellation Leo marks to us the origin of such and such a crime. A strange justice that is bounded by a river! Truth on this side of the Pyrenees, error on the other side.
People admit justice does not consist in these customs, but resides in natural laws, common to every country. They would certainly maintain it obstinately, if reckless chance which has distributed human laws had encountered even one which was universal; but the farce is that the caprice of men has so many vagaries that there is no such law.
Theft, incest, infanticide, parricide, have all had a place among virtuous actions. Can anything be more ridiculous than that a man should have the right to kill me because he lives on the other side of the water, and because his ruler has a quarrel with mine, though I have none with him?
Doubtless there are natural laws; but good reason once corrupted has corrupted all.
Very true.
The Law of Large Numbers brings interesting and horrible things into being. The troubling part of the Gun Debate resolves to this: as we add more guns to the equation, we bring these statistically-rare events out of the realm of the improbable into the real world.
We shouldn’t wonder to see so many Sandy Hook tragedies. Given a sufficient quantity of guns and ammunition, and a sufficient number of crazies, it’s not a question of If any more. There will be more Sandy Hooks and Columbines and Virginia Techs. Absolutely predictable.
Can’t say that to the Gun Fans, though. Disturbs too many comfy and completely unscientific presuppositions.
This is definitely one of those cases where what makes for bad social science (small data sets) makes for a better world.
This was fantastic. I really appreciate having tools to understand social science research and its differences vis-à-vis medical research.
This is awesome. Excellent explanations. I loved it straight away!
Then you threw in the wholly unnecessary reference to a Rust Monster, and you just blew me away!
I love you man. Or, at least, my 1st Edition 12th level Ranger does.
And my Pathfinder 12th level Alchemist salutes you.
Geek Power!
Really good article.
But even the more straightforward statistics can have complications. Crimes reported versus crimes committed. Deaths not properly identified as murders. If you’re using criminal convictions, a lot of them are plea bargains down from more violent crimes. If you’re considering gun ownership, you have to remember that many people who have guns are exactly the kind of people who won’t tell the clipboard-bearing stranger knocking on the door that they have guns.
Yes, definitely.
I should think one of the big problems is the lack of statistical information on sales, etc.
Last week NYT ran a pretty amazing piece on the difficulty of tracking a gun down — serial # to manufacturer, then to seller, then to sale. That’s a nightmare.
It disturbs the same way treatment information in the health care debate disturbed: the data needed for critical analysis belongs to the manufacturers and sellers, it’s proprietary, and likely a valuable marketing data point for the industry. In health care, that same information belonged to the insurance companies (diagnosis attempts, treatments, outcomes), much of the information needed to understand how the health care system actually functions.
But here’s something: if you buy a gun, what happens with the information about you — your name, address, etc.? You think the gun industry protects your privacy, or are you added to data bases and mailing lists and sold to bidders without your permission? Do you have a history that follows you, from gun sale to gun sale, from ammo purchase to ammo purchase?
Before anyone goes and gets all het up here, I have the same concerns about data-sharing from perceived liberal things, too. I suspect we have little notion of how manipulated we are.
The three most important things in research are replication, replication, and replication.