Did Indians ever live on my property? I do a careful search for pottery and arrowheads, don't find any, and conclude the answer is no.
But even substantial habitation sites go unnoticed because artifacts get scattered and buried by later deposits. I just read of a couple of quite large settlements found in the newly expanded part of Petrified Forest National Park. Also, before my home was built, the area was probably farmland, which means the surface was disturbed, and artifacts that were noticed were picked up. Then the lot was disturbed when the house was built. Finally, even with millions of people over thousands of years, habitation sites made up only a tiny fraction of the landscape, so there is a very good chance that my home never was part of a habitation site (except mine). Therefore, absence of evidence isn't evidence of absence.
Suppose we try a different hypothesis. Indians had technology comparable to our own. They had good highways, flight, electricity and large cities. We just can't find any evidence because "absence of evidence isn't evidence of absence." In this case, the maxim is ridiculous. An advanced civilization on a par with our own would have left gigantic amounts of evidence. The fact that we don't find it is solid evidence that it doesn't exist.
And of course absence of evidence is used all the time in science as evidence of absence. We say the dinosaurs went extinct about 66 million years ago, instead of simply hiding like in Dilbert, because we just don't find dinosaur fossils after that time. We are sure humans didn't come to North America before perhaps 20,000 years ago because of the total lack of human remains and artifacts from before that time. Absence of even rare things like dinosaur fossils and artifacts becomes significant if it extends over a large enough area.
So the real principle is pretty clear. Failure to find something that's normally uncommon may mean you simply missed it, or that by chance it never occurred in your search area. Failure to find something that should have left widespread and obvious evidence is actual evidence of absence.
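One way to make that principle concrete is Bayes' rule: how far a fruitless search should lower your belief depends on how likely the search was to succeed if the thing were really there. The sketch below is only an illustration. The function name and every number in it are assumptions picked to mirror the two cases above, not measurements.

```python
def prob_present_given_not_found(prior_present, detect_prob):
    """Posterior probability that something is there, given a search found nothing.

    prior_present: probability it was there before searching (assumed).
    detect_prob:   probability the search would find it if it were there (assumed).
    """
    missed = prior_present * (1.0 - detect_prob)  # there, but the search missed it
    truly_absent = 1.0 - prior_present            # not there, so nothing to find
    return missed / (missed + truly_absent)


# Hypothetical numbers, chosen only to illustrate the two cases in the text.
# A scattered, buried habitation site: a backyard search would rarely detect it.
backyard = prob_present_given_not_found(prior_present=0.02, detect_prob=0.10)

# An advanced civilization with highways and cities: evidence would be hard to miss.
civilization = prob_present_given_not_found(prior_present=0.02, detect_prob=0.999)

print(f"habitation site still possible: {backyard:.4f}")
print(f"lost high-tech civilization still possible: {civilization:.6f}")
```

With these made-up inputs the first probability barely moves from the 2 percent starting point, while the second falls to roughly two in a hundred thousand. Same empty search, very different conclusions.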
Finally, consider this. The police search your home for drugs and child porn. After an exhaustive search, they don't find any. But the DA takes you to trial anyway, arguing that "absence of evidence isn't evidence of absence." How do you feel about that maxim now?
The Plural of Anecdote is not Data
What on earth is data if it's not a large collection of individual observations, any one of which could be called an "anecdote?"
The problem with anecdotal evidence is that, to be valid, the anecdote has to be true and it has to be representative. The true part seems self-explanatory, except the world is full of urban legends (and a salute here to Jan Harold Brunvand for introducing the concept) that people pass along and embellish because they sound plausible. For example, I heard of a small child who was allowed to go to a public rest room by himself. There was a scream and a man rushed out. The parents found the child sexually mutilated. I heard the same story twice, years and half a dozen states apart. Then, during the Gulf War, I heard of a U.S. serviceman who had an affair with a Saudi woman. They were found out, the serviceman was quickly spirited out of the country and his hapless lover was executed. By this time, I knew about urban legends and recognized this tale as one immediately. My suspicions were confirmed when I heard the same story a second time, with a few details changed. So it pays to check sources.
However, it's my experience that when people demand sources for an incident, they rarely care anything about accuracy or intellectual honesty. Two minutes on Google will usually reveal whether a story is based on reputable sources or not, although even reputable media outlets regularly get conned. Much more often, a demand for sources is a cheap and lazy way of discrediting something the hearer doesn't want to accept.
Then there are what I call "gee whiz" facts: things that sound impressive at first but turn out to have no substance when you look closer.
- "A million children are reported missing every year." That means that 18 million children, roughly one in four, would disappear by the time they reach 18. I suspect we'd notice that. Yes, a million children are reported missing every year, but the vast majority are found within 24 hours. And most of the long-term disappearances are due to non-custodial parents.
- "Suicide is the --th leading cause of death among teenagers." Without in the least making light of this issue, what kills teenagers? They're beyond the reach of childhood diseases and not vulnerable yet to old age diseases. That leaves accident, suicide and homicide, and anything but that order means there's a real problem. So suicide will always be a leading cause of death among teenagers because they die of so few other things.
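The sanity check on the missing-children figure is plain arithmetic, and it takes only a few lines to run. The sketch uses nothing but the numbers quoted in the first bullet. The child population is back-calculated from the "one in four" in the text and is not a census figure.

```python
# Figures taken from the text above.
reports_per_year = 1_000_000  # "a million children are reported missing every year"
years_of_childhood = 18

# If every report were a permanent disappearance, the losses would pile up.
cumulative_missing = reports_per_year * years_of_childhood

# "Roughly one in four" implies a child population about four times that total.
# This is a back-calculation from the text, not an official statistic.
implied_child_population = cumulative_missing * 4

print(f"{cumulative_missing:,} children gone by age 18")
print(f"implied child population: {implied_child_population:,}")
```

This prints 18,000,000 and 72,000,000. Any "gee whiz" number that implies a quarter of all children vanish for good deserves a second look.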
So instead of blowing off evidence as "anecdotal," ask whether it's true and representative. Yes, this may mean going to Google and getting your widdle fingers all sore from typing, but you might actually learn something. Oh, have you also noticed that conservatives tell "anecdotes," but liberals tell "narratives?"
Correlation Isn't Causation
Okay, then, what does demonstrate causality if it's not observing similar results time and time again? If you repeat a cause and observe the same effect, especially if you can change details and predict how the results will change, you have a pretty ironclad case for causation. The laboratory sciences use this approach as the conclusive proof of theories. The problem arises when we look at one-of-a-kind situations or events in the past and try to use data to figure out why things happened as they did. In those cases we can't reproduce all the potential causative factors at will.
Now a plot of my age against the price of gasoline shows a pretty linear trend, so either my getting older is making gasoline more expensive, or gasoline getting more expensive causes me to age. Interestingly, the big drop in prices in late 2014 didn't make me any younger (sob). And there are tons of joke examples. A site on spurious correlations shows graphs of:
- US spending on science, space, and technology <---> Suicides by hanging, strangulation and suffocation
- Number of people who drowned by falling into a swimming-pool <---> Number of films Nicolas Cage appeared in
- Per capita consumption of cheese (US) <---> Number of people who died by becoming tangled in their bedsheets
- Age of Miss America <---> Murders by steam, hot vapours and hot objects
- Per capita consumption of mozzarella cheese (US) <---> Civil engineering doctorates awarded (US)
Listing stuff like this is like eating potato chips; it's hard to stop. The Nicolas Cage and Miss America examples are especially nice because there are a number of peaks and valleys in the graphs. But all these graphs have correlation coefficients near or above 0.9. If I produced a similar graph showing, say, incidence of spanking versus psychological problems in later life, any social science journal would accept it. (If I had a similar correlation between spanking and success in later life, I'd have a lot more trouble.)
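It is easy to manufacture a correlation like the age versus gasoline one: any two quantities that both drift upward over the same years will do. The sketch below uses synthetic, made-up numbers (the starting age, the price slope and the noise range are all assumptions) and computes the ordinary Pearson coefficient by hand.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(30)

# Synthetic series: both simply drift upward with time, and the price has
# its own independent noise. Neither one is computed from the other.
age = [40 + t for t in years]
gas_price = [1.00 + 0.08 * t + rng.uniform(-0.3, 0.3) for t in years]

r = pearson_r(age, gas_price)
print(f"r = {r:.2f}")
```

The coefficient should come out well above 0.9, even though the only thing the two series share is the calendar.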
For a correlation to be at least plausible evidence for causation, there has to be a plausible causative link. Maybe people get so distraught at spending money on space that they hang themselves. Maybe some moviegoers disliked Wicker Man or Con Air so much that they drowned themselves in their swimming pool. But I doubt it. On the other hand, I recently plotted up vote tallies, race and poverty in the Deep South and found that they had insanely high correlation coefficients. There's no doubt about the connection there. Poverty is concentrated among blacks, who overwhelmingly vote Democratic.
No, nobody cares if you don't like the implications. If someone plots usage of pot or pornography against some negative social outcome and finds a strong correlation, and suggests a causal link, that's evidence for causation. The fact that you may not want to believe it is irrelevant. And your acceptable counter-strategies boil down to:
- Discredit the data. Show that the data are wrong or cherry-picked.
- Discredit the correlation. Show that the correlation doesn't hold in other, similar settings, or if you broaden the time or space range.
- Discredit the causal mechanism.
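The second strategy, broadening the time range, can be sketched in a few lines. The data here are invented for illustration: a price that climbs steadily for twenty years and then falls for ten, plotted against an age that just keeps climbing.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented series: age rises every year; the price rises for 20 years,
# then drops by 0.25 a year for the next 10.
age = [40 + t for t in range(30)]
price = [1.0 + 0.1 * t if t < 20 else 2.9 - 0.25 * (t - 19) for t in range(30)]

r_narrow = pearson_r(age[:20], price[:20])  # only the years where both rise
r_wide = pearson_r(age, price)              # the full 30 years

print(f"first 20 years: r = {r_narrow:.2f}")
print(f"all 30 years:   r = {r_wide:.2f}")
```

Over the first twenty years the fit is essentially perfect. Over the full thirty it collapses to nearly zero, which is the kind of result that legitimately undercuts a claimed correlation.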
And "Correlation doesn't prove causation" doesn't even apply to unique events. "I inoculated my kid, and she developed webbed feet and grew horns." That's not even correlation because there's nothing else to relate the event to. That's more like "Coincidence isn't causation."