A few weeks ago, a friend of mine texted asking if any of our group knew of Dan Ariely, because she was interested in his first book “Predictably Irrational.” In fact, I do know of Ariely as I read that book when it came out. So I replied thusly:
“it’s an entertaining book, though like all pop psych stuff it makes a lot of unsupported leaps
lot of weird, fun studies that have probably never been reproduced”
WELL, what are the chances? Because just a week or two later, the news hit that one of Ariely’s studies has been retracted due to evidence of fraud. Oops!
Okay, here’s the background: Ariely is a behavioral economist — so, economics but as it is affected by humans with all their weird human psychology. He has a really cool and sympathetic backstory as he described in Predictably Irrational: he got interested in applied psychology when he experienced a severe accident as a teenager that resulted in extensive, painful burns across most of his body. During his recovery he became interested in how to better treat patients in horrible situations like his.
So he started doing these studies that were very well-timed because pop psychology was just having its moment. Like, what if instead of charging a set price for a slice of pizza, you made people pay a small fee for every bite. They hate it! Science.
I’m being glib, but I really did enjoy and learn a lot from his research, like this study published in a letter to JAMA in 2008 that found that if you sell someone a useless sugar pill for whatever ails them, they will think it works better if it costs them more money. They did this by giving 80 people a (fake) “painkiller” pill and then shocking them. Like, literally, shocking them with electricity. Half the people thought they were taking an expensive pill, and half a discounted pill. The expensive group thought the shocks hurt less compared to the discounted group.
See? Fascinating, and it fits in with what we know, or want to think we know, about human nature. Why do people pay so much for Gwyneth Paltrow’s bullshit? And why do they keep paying even after trying it once? Surely they see they got ripped off, but no! This suggests that maybe the price they’re paying actually makes them feel LESS like they’re getting ripped off.
But is it true? Well! Other studies have supported it, like this paper from 2013 that found that patients who took a “generic” version of a drug compared to the “branded” version experienced more negative side effects and lower effectiveness. I’m side-eyeing those error bars but hey, even a small result is an interesting result that supports the previous study. But it’s still not a replication, in which an (ideally) new researcher does the exact same study to see if they get the exact same results.
Which brings us to a study Ariely helped run and published in 2012 in PNAS titled “Signing at the beginning makes ethics salient and decreases dishonest self-reports in comparison to signing at the end.”
The crux was this: is there a simple way to get people to be honest when you have no real way to easily check that they’re being honest? For instance, the IRS doesn’t know if you’re being honest when you file your taxes each year, until they have a chance to closely examine your return and other documents. That takes time and money and energy, so if there’s a way to get people to be more honest on their own it may be worth it for them to do so.
In the case of your tax return, the IRS has you sign your return at the end promising you’ve been truthful. Ariely et al thought that maybe people would be more likely to be honest if they sign that declaration at the beginning of their return — for instance, you might be more likely to spend your time filling out the return thinking about the declaration you just signed, even if that thought is lurking in the back of your subconscious.
This is a similar concept to research exploring how our inherent biases may screw with us — some studies have suggested that if students have to mark down whether they’re male or female before a math test, women are more likely to do worse, because merely calling to mind that they’re a woman also calls to mind all the biases associated with being a woman, like how women just aren’t as good as men in math.
To test their honesty hypothesis, Ariely and his team conducted a few experiments in the lab, like having subjects take a quiz where they have the opportunity to cheat for financial gain. They found that subjects who had to sign at the top of the paper cheated less often than those who signed at the bottom or not at all.
Then came the field test, in which they solicited help from a car insurance company. They had the company’s customers fill out a form with their odometer readings, and had them sign at the top or the bottom indicating they were being honest. Driving fewer miles would result in lower insurance premiums, so they had a financial incentive to lie. They also had a previous odometer reading from these same customers.
They found that customers who signed at the top of the form reported more than 10% more miles driven than those who signed at the end, and considering that they surveyed more than 13,000 customers, that’s a pretty solid result.
And so for the past ten years, this paper has been cited repeatedly by other researchers, been incorporated into corporate policies, and even been considered by governments with regard to tax documents.
But in March of 2020, there was an update. Three of the original researchers (which did not include Ariely) tried to replicate the findings from that initial paper, first in the lab. They redid the “you can cheat for financial gain” quiz, but instead of using 100 subjects they used 4,559. And with that large number of participants, they saw the effect completely disappear. It didn’t matter whether they signed at the beginning, the end, or on their own foreheads, they all lied or told the truth at the same rate.
“Well that sucks,” they probably said. “If it doesn’t actually work in the lab…what about the field experiment?”
So they went back and looked at the data and realized their two groups — signed at the top or signed at the bottom — weren’t equivalent. The group that signed at the top ALREADY DROVE MORE MILES compared to the group that signed at the bottom. So the difference between them wasn’t because the top-signers were telling the truth about driving more, it was that they actually were driving more and the bottom-signers actually were driving less.
All the initial authors, Ariely included, signed a statement in Scientific American saying “hey, our bad, this top-signing thing isn’t actually real.”
There we have it, a wholesome story about scientists going back to actually replicate their findings, getting a negative result, and coming clean about it. That’s how science is supposed to work. Awesome.
After they published that letter, OTHER researchers had some questions. Like, you RANDOMLY split 13,000 people into two different groups to measure their self-reported mileage and you just happened to accidentally have one group that drove way more miles than the other? I mean, sure, it could happen accidentally but…maybe we should have a look at that data.
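To get a feel for how much two randomly assigned halves would normally differ, here is a rough sketch in Python. Everything in it is an assumption for illustration: the baseline readings are invented (drawn from an arbitrary skewed distribution), and `relative_gap` is a hypothetical helper, not anything from the paper or from the researchers who checked it. It only shows what random splitting does to made-up numbers.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable
n = 13_488              # number of customers quoted in this post

# ASSUMPTION: invented baseline odometer readings, NOT the real data.
# A gamma distribution is used only because it is skewed and non-negative.
baseline = [rng.gammavariate(4.0, 17_000) for _ in range(n)]

def relative_gap(values, rng):
    """Randomly split values into two halves; return the gap in means as a fraction."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    mean_a = sum(shuffled[:half]) / half
    mean_b = sum(shuffled[half:]) / (len(shuffled) - half)
    return abs(mean_a - mean_b) / mean_b

# Repeat the random split many times and keep the biggest gap seen.
gaps = [relative_gap(baseline, rng) for _ in range(200)]
largest_gap = max(gaps)
print(f"largest gap across 200 random splits: {largest_gap:.2%}")
```

With this many people in each half, the two group averages in this toy setup stay close together on every split. That is the intuition behind the raised eyebrows: a large gap between randomly assigned groups of this size is possible, but it should be rare.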
Luckily, the researchers who found all this in 2020 posted all the data, so other researchers downloaded it and looked it over. And what they found was, well, sus. Are we still saying sus? Were we ever saying sus? This data definitely killed a crew member and then hopped in a vent.
When the anonymous researchers plotted the data from the original field experiment, they found that the distribution was weird as all heck. Here’s what you might expect to see: most people drive 5-10,000 miles, and the number drops off on either side of that — fewer drive 2-5,000 and 10-15,000, and even fewer drive 251 to 2,000 and 15 to 20,000, and so on.
But the data from the 2012 study showed that each category of miles driven had almost exactly the same frequency — there are exactly as many people who drove 10,000 miles as drove 20, 30, 40, and 50,000 miles. Across 13,488 “random” people. Weird.
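Here is a small Python sketch of the difference between those two shapes. Both data sets are synthetic and generated on the spot: the "plausible" one uses an arbitrary skewed distribution that I picked as a stand-in for real driving habits (an assumption, not a measured fact), and the "flat" one makes every mileage up to 50,000 equally likely, which is the pattern described above. The `bin_counts` helper is hypothetical.

```python
import random

def bin_counts(values, width=10_000, top=50_000):
    """Count how many values fall into each bucket of `width` miles below `top`."""
    counts = [0] * (top // width)
    for v in values:
        if 0 <= v < top:
            counts[int(v // width)] += 1
    return counts

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 13_488              # number of customers quoted in this post

# ASSUMPTION: a made-up "plausible" shape with a peak at lower mileages
# and a long tail. This is not real driving data.
plausible = [rng.gammavariate(2.0, 6_000) for _ in range(n)]

# The pattern described in the post: every mileage equally likely.
flat = [rng.uniform(0, 50_000) for _ in range(n)]

plausible_counts = bin_counts(plausible)
flat_counts = bin_counts(flat)
print("plausible:", plausible_counts)
print("flat:     ", flat_counts)
```

The first list of counts falls away sharply as mileage climbs, while the second is nearly level from bucket to bucket. Real groups of drivers tend to look like the first; a uniform random number generator produces the second.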
The anonymous researchers also found that in the initial odometer readings already on file with the insurance company, a lot of people clearly just rounded their readings. So, say, if it read 9,043 they would just say it was at 9,000 because who cares? You can tell that because those nice round numbers showed up more frequently in the data set.
Guess what was different about that second data set? That’s right, round numbers were not more frequent. “9,043” was reported exactly as often as “9,000.” Weird if we’re talking about the same humans who were totally fine with rounding on their first take. Not weird if we’re talking about, oh, I don’t know, a random number generator.
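The round-number check is simple enough to sketch in a few lines of Python. Again, this is synthetic data under loud assumptions: I invented a 30% rounding rate for the "human-like" readings and an arbitrary odometer range, and `round_share` and `human_report` are hypothetical helpers. It is not the actual analysis, just the idea behind it.

```python
import random

def round_share(readings, base=1_000):
    """Fraction of readings that are exact multiples of `base`."""
    return sum(1 for r in readings if r % base == 0) / len(readings)

rng = random.Random(1)  # fixed seed so the sketch is repeatable
n = 13_488              # number of customers quoted in this post

def human_report(true_reading):
    # ASSUMPTION: 30% of people round to the nearest thousand.
    # The rate is made up purely for illustration.
    if rng.random() < 0.30:
        return round(true_reading, -3)
    return true_reading

# ASSUMPTION: arbitrary odometer range, not taken from the real data.
true_readings = [rng.randint(1_000, 150_000) for _ in range(n)]
human_like = [human_report(r) for r in true_readings]

# Readings straight out of a random number generator, with no rounding habit.
generated = [rng.randint(1_000, 150_000) for _ in range(n)]

print(f"human-like share of round thousands: {round_share(human_like):.1%}")
print(f"generated share of round thousands:  {round_share(generated):.1%}")
```

When people round, multiples of 1,000 pile up far beyond what chance allows. When a random number generator writes the numbers, a reading ending in 000 is no more common than one ending in 043.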
There are more anomalies as reported over on Data Colada, like strong evidence that the initial dataset was also manipulated using a random number generator, but that should be enough to understand their conclusion: this is fraud.
Fraud on whose part, though? Well, of the five authors of the initial paper, all agree that the fourth author is the only person who handled the car insurance data: Dan Ariely. Ariely responded with a brief note, pinning the blame on the insurance company. He writes that he “did not test the data for irregularities, which after this painful lesson, I will start doing regularly.”
That leaves us with a few options for what really happened here. If Ariely is being honest, then it means that someone at the insurance company decided to not bother typing in all the real data from the customers (which they had obviously done in the past, so this wasn’t a ridiculous request) and instead found a random number generator to create all the data out of thin air. Then they went back to the original baseline readings they already had, duplicated them (while making the mistake of switching the font from Calibri to Cambria, which is how the anonymous researchers noticed), and added a random number between 1 and 1,000 to each of the duplicates, which make up exactly half of the baseline reported mileages.
This resulted in an approximate 10% increase in one half of the reported mileages compared to the other half, which just so happens to exactly confirm the hypothesis of the researchers, which the employee at the insurance company would surely not have known.
The evil insurance company employee then packed up the data and sent it to Dan Ariely, who for some reason decided to not do a single basic check on the data to be sure it looked right, and who simply went ahead and published.
That’s one option. The other option is that Dan Ariely made up the data to fit his hypothesis.
You can decide for yourself which of those is more plausible but before I leave you to it, consider this reply from one of the other co-authors of the 2012 study, Max Bazerman. He looked over the data analysis and said he agrees that the data is clearly fraudulent. He goes on to say:
“There were indications of problems from the start (2011).
“The first time I saw the combined three-study paper was on February 23, 2011. On this initial reading, I thought I saw a problem with implausible data in Study 3. I raised the issue with a coauthor and was assured the data was accurate. I continued to ask questions because I was not convinced by the initial responses. When I eventually met another coauthor responsible for this portion of the work at a conference, I was provided more plausible explanations and felt more confidence in the underlying data. I would note that this coauthor quickly showed me the data file on a laptop; I did not nor did I have others examine the data more carefully.”
So if you really believe the “evil car insurance employee” hypothesis, then you should also accept the fact that in that case, even in the very best light, Dan Ariely is completely inept at his job. Not only did he “not test the data for irregularities” but according to his co-author, when those irregularities were pointed out early in the process he dismissed them and reassured his co-author that “the data was accurate.” Despite not testing it.
I love not being sued, so I won’t tell you whether I think that Dan Ariely is an out-and-out fraud or simply too stupid and incompetent to look at the data his own co-author thinks is suspicious before publishing it. I just know that I won’t be buying any more of his books.