Why Drug-sniffing Dogs Don’t Work: Racism and the Clever Hans Effect


Sorta transcript:

Are police officers allowing their personal biases to influence how they treat the general public and possible criminal suspects? Yes, obviously. Have you not been paying attention?

What’s newsworthy at the moment, though, is the discovery that those personal biases are even screwing up canine units. Police often use drug-sniffing dogs in places like border crossings to identify and apprehend possible smugglers. Dogs are really good at this job: they have about 50x more olfactory receptors in their noses than humans, and the part of a dog’s brain that processes that information is about 40x the size of ours, proportionately speaking.

Dogs can use their noses to find drugs, people, and according to some preliminary research, even cancer cells. Here’s the thing, though: they need to be trained by a human to do that, and after training they have to work with a human to properly apply their training. And that’s where we run into the Clever Hans effect.

Clever Hans was a horse who got famous in the early 20th century for supposedly being able to solve math puzzles. You would ask him a question or write a problem on a chalkboard, like “1+2”, and Clever Hans would stamp his hoof three times. Some of the questions were pretty complex, even, like “If the eighth day of the month comes on a Tuesday, what is the date of the following Friday?”

Clever Hans was usually right, but when researchers studied him more closely, they realized that he was only good at getting the answers right if the person asking the question knew the correct answer, and if Hans could see the questioner. In other words, Clever Hans really was clever, but he was much better at the art of reading a person than at mathematics. Even his owner probably didn’t realize that this was happening, at least in the beginning.

It’s a great example of the importance of double blinding your experiments — making sure that the person running the experiment doesn’t know the desired result, and that the subject of the experiment doesn’t, either. Because it turns out it’s really hard to remain perfectly unbiased, as we’re all constantly giving off subtle hints about what we’re thinking or what we want to happen, and these can unknowingly influence the experiment.

You may already be able to guess how Clever Hans applies to drug-sniffing dogs. The dogs are usually properly trained in how to detect a drug, but when they’re working in the field, they are with a human who they desperately want to please.

A federal appeals court has just ruled that the use of a drug-sniffing dog was legal, despite the findings that the dog in question, Lex, signalled that he had found drugs 93% of the time he was used. His actual success rate at finding the drugs was only 59%. So he was constantly subjecting people to invasive searches, but was barely better than a coin flip at actually finding drugs.

It turns out that Lex’s handler was giving him a reward every time he alerted, whether or not the alert led to a drug discovery. So he learned that alerting = treat.

It also turns out that Lex’s success rate is higher than what was found in a study of Chicago drug dogs, who managed to correctly ID drugs just 44% of the time, or worse than a coin flip. And if you narrowed it down to Latino drivers, the dogs were only accurate 27% of the time. But they’re still used, and the federal courts have given their approval for their continued use, despite the fact that there are no controls in place to prevent the Clever Hans effect from subjecting millions of innocent people to unnecessary search and seizure.
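To see why a 59% hit rate is so unimpressive, here is a rough back-of-the-envelope sketch in Python. It assumes the 93% and 59% figures describe the same set of stops, and that 59% means the share of Lex's alerts that actually turned up drugs; neither assumption is confirmed here, and the function name `prevalence_bounds` is made up for illustration.

```python
def prevalence_bounds(alert_rate, precision):
    """Bound the share of all stopped cars that carried drugs.

    alert_rate: fraction of stops where the dog alerted (assumed 0.93)
    precision:  fraction of alerts that turned up drugs (assumed 0.59)
    """
    true_alerts = alert_rate * precision   # cars alerted on that had drugs
    unalerted = 1.0 - alert_rate           # cars the dog did not alert on
    # Lowest case: no un-alerted car had drugs. Highest case: all of them did.
    return true_alerts, true_alerts + unalerted


low, high = prevalence_bounds(0.93, 0.59)

# The dog's edge over searching cars at random is largest when the
# true share of drug-carrying cars is at its lowest possible value.
lift_at_best = 0.59 - low

print(f"share of cars with drugs: between {low:.1%} and {high:.1%}")
print(f"dog's edge over random searches: at most {lift_at_best:.1%}")
```

Under those assumptions, somewhere between roughly 55% and 62% of the stopped cars had drugs, so searching cars at random would have done about as well as following the dog.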

The only scientifically reasonable next step is to either stop using the dogs or better yet, stop using police officers, since an unbiased dog should be able to do their job about 50x better than they can, and we’d probably see a significant decrease in the number of innocent people gunned down without provocation. Also a significant increase in tummy rubs. Everyone wins.

Rebecca Watson




  1. Now that I think about it, this was an inevitable result of using dogs along with police officers. Despite evidence to the contrary, most cops are still human. Dogs have been selectively bred over thousands of years to be exquisitely sensitive to human cues. Try pointing your cat’s attention to something and you’ll see the difference. (And I’m saying this as a cat person.)

    Maybe robots are the answer. And if you can program one to also do petting, I have some cats who might be interested.

  2. I’m relying on memory here (sorry) but I recall reading that ONLY Kluge Hans’ owner triggered the ‘stop stamping’ response. Apparently he looked down at Hans’ feet while he stamped, and would look up when the desired total was reached. In addition, he wore a broad brimmed hat at all the sessions so the movement was even more obvious.

    Big problems with selective ‘cuing’ by cops in other areas. Photo line-ups are notorious for leading to false identifications. Witnesses often aren’t told that they may be shown several ‘six ups’ of images. So the tendency to pick the nearest match on the first sheet (especially if the color matches) is of huge significance.

    I don’t remember the psychologist (maybe Loftus or Ofshe) who has been campaigning to change photo line-up techniques. They claim that letting the witness leaf through a stack of individual images drastically improves accuracy.

    There might be a way of reorganizing the use of dogs that would reduce cuing by handlers. But like the improved photo system, the police will resist any change.

  3. A 50% hit rate could be much better than chance. Suppose I can pick out 50% of all the librarians I walk by each day. That could be chance if I claimed that one in every two passersby was a librarian. If I only made the claim about 1 in 1,000 people, it would be quite good.

    1. Bah, the whole War On Drugs is a big waste of time and money not to mention yet another excuse to oppress minorities. But leaving that aside for the moment!

      It is easy to get confused with this sort of analysis. There is an introduction to it here: https://en.wikipedia.org/wiki/Sensitivity_and_specificity

      From the figures in the article we have (out of 100)
      true negatives 7, false negatives 0??, true positives 59, false positives 34.

      So sensitivity 100%, specificity 17%, positive predictive value 63%, negative predictive value 100%, accuracy 66%

      You are right that predictive value depends on prevalence. It is as much a statement about the proportion of actual positives in a population as it is about the test.

      In this case it seems that the cop has done quite a “good” job (!) to pull a high proportion of drug carriers out of the general population and the dog has added very little to the picture.

      This analysis may be wrong depending on exactly how these figures were obtained.

      1. In particular, it is not clear if the 93% and the 59% figures came from the same exercise. If they came from different populations, the analysis fails.

  4. That’s possibly why a lot of the sniffer dogs at the airport work in the luggage area: they can’t be biased against luggage.

  5. Just to clarify, if a dog’s alerts are right 50% of the time in a population of 100 with only one carrier, the dog picks 2 people to be searched; one is released and one is busted.
    This is impressive and much better than a coin toss!

    If the population has 50 carriers, the dog picks most of the 100 to be searched.
    Not impressive and equivalent to a coin toss.

    The quoted statistics could fit either scenario depending on how they were obtained.

    This was discussed here a few years ago and I repeat, there needs to be a quality control survey for accreditation like we have in diagnostic labs.

    A central authority mails out blind samples to test, some positive and others negative, on a monthly or other regular basis. Results are posted back to the authority, which performs statistical analysis for all to see. Anonymity is preserved, but management reviews the results and draws conclusions, especially if you fall in the lowest percentiles! Changes are made and gradually results improve.

    Maybe the cops do some variation of this but if so it seems ineffective.

    1. While that’s true, it’s totally irrelevant. Rebecca addressed that already. This dog alerted 93% of the time, and there were drugs 59% of the time. So the dog was alerting on essentially every car, and 60% of the cars he alerted on had drugs. If we make the maximally-generous assumption that he alerted on every drug-carrying car that he saw, then in 1000 cars that he sees, he alerts on 930 of them, and 549 of those have drugs. That means that 55% of cars had drugs, and the dog was identifying drugs in 59% of his identifications, which is 4 percentage points better than flipping a coin would have done.

  6. Everybody implies the dog test to be bogus, but the whole thing is loaded with assumptions.

    For instance, we do not know if the unsearched cars were carriers.

    I find it hard to believe that 59% of US drivers carry drugs.
    We do not know for certain if that figure came from the same data set as the 93%, although everybody assumes so; it could well have come from training statistics. This is the trouble with using secondary and tertiary sources: it would not be the first time unrelated figures have been smooshed together.
    Perhaps Rebecca has checked this aspect?

    That is one hell of a target rich environment. In that environment, the false positives could well be the dog alerting on traces of drugs. In any case, a high sensitivity, low specificity test (dog’s nose) would show its value much better in a target poor setting, such as I described above, used as a screening test.

    We do not know and cannot say that the dog would alert 93% of the time if most of the cars were truly negative.

    Anyway, the whole War on Drugs disgusts me when people are being shot for walking while black. Shame!

  7. Yeah, I remember back in the mid-90s when I first heard of ‘racist’ dogs, and it was, of course, the Clever Hans effect. How soon people forget.

  8. “The only scientifically reasonable next step is to either stop using the dogs or better yet, stop using police officers, since an unbiased dog should be able to do their job about 50x better than they can, and we’d probably see a significant decrease in the number of innocent people gunned down without provocation. Also a significant increase in tummy rubs. Everyone wins.”

    Or we could just legalize drugs and just be done with all this foolishness.
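The sensitivity and specificity figures worked out in the comments above can be checked with a short Python sketch. It takes the commenter's counts at face value (59 true positives, 34 false positives, 7 true negatives and an assumed 0 false negatives, out of 100 stops); those counts are one reading of the article's numbers, not confirmed data, and the function names are made up for illustration.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard screening-test metrics from a 2x2 table of counts.

    Assumes every denominator is non-zero, which holds for the counts below.
    """
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # share of drug carriers the dog flags
        "specificity": tn / (tn + fp),   # share of clean cars the dog passes
        "ppv": tp / (tp + fp),           # share of alerts that are correct
        "npv": tn / (tn + fn),           # share of non-alerts that are correct
        "accuracy": (tp + tn) / total,
    }


def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value of the same test at a different base rate."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Counts quoted in the comment thread (the zero false negatives is a guess).
metrics = diagnostic_metrics(tp=59, fp=34, tn=7, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")

# Same hypothetical dog, but only 1 car in 100 actually carries drugs.
rare = ppv_at_prevalence(metrics["sensitivity"], metrics["specificity"], 0.01)
print(f"ppv when 1% of cars carry drugs: {rare:.1%}")
```

The second function shows the point several commenters make: how often an alert is right depends as much on how common drugs are among the cars being stopped as on the dog itself.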
