It’s no secret that I’m a sucker for articles pointing out why some giant study being reported on in the media is actually bunk. In fact, that’s pretty much the only thing I ever write about here at Skepchick. So, this week when I saw everyone on my Facebook feed sharing an article from the Awl entitled “Gaslighting Millenials: That Big Study About How the Student Debt Nightmare Is in Your Head? It’s Garbage” which promised to be a takedown of a New York Times article, I immediately jumped on it.
I started out completely onboard with the Awl’s takedown of the Grey Lady herself, but the further I got in the article the more I started to wonder if perhaps the takedown itself was the one in the wrong. Although the author seemed well meaning, he also didn’t seem to have much statistics knowledge or understand what exactly makes a strong versus a weak study.
I followed up reading the Awl piece by reading the NYT article that it was criticizing. The Awl strongly implies that the NYT article claims there are no real issues with massive amounts of student loan debt in the US, but the NYT piece doesn’t actually say that at all. Its argument is not that student loan debt is unproblematic, only that the problems do not take the stereotypical form we usually imagine. The cliché, according to the NYT, is the student who went to a private school and racked up tens or hundreds of thousands of dollars in debt. The article then provides evidence from the Brookings Institution study that borrowers with more than $50,000 in debt are actually only a small portion of all Americans with student loan debt. It then goes on to present evidence that the real problem with student loan debt is drop-outs who took out thousands of dollars in loans and then never graduated with a degree. In other words, student loans are supposed to be an investment. You take on debt now with the promise that your future higher wages will be able to pay off the debt. However, students who don’t graduate end up taking on debt without ever gaining the future wage benefit of a college degree.
Ok ok ok, but the issues that Choire Sicha, author of the Awl article, had with the NYT was over the claim that “Only 7 percent of young-adult households with student debt have more than $50,000 in such debt.”
Sicha had a whole list of problems with the Brookings Institution study that came up with that 7% number. I’ll go through each of Sicha’s points and see which hold up to scrutiny.
Sicha’s Point #1:
Those aren’t households with people between 20 and 40; those are households headed by people between 20 and 40. Which is to say, this data excludes all people living in households headed by, say, their parents, or other adults. The way Brookings put this is: “households led by adults between the ages of 20 and 40.” Just another way to say it excludes all households led by anyone over 40! (Those households might be identical in student debt to “young” households! Or they might not? WHO KNOWS!)
This point is just factually untrue. According to the Brookings Institution report, they defined a young household as a household in which the “average age of adults is between 20 and 40.” In other words, a young adult living with his/her parents would still have their household included in the study as long as the average age of all the adults in the house was between 20 and 40. So, a household where a 20-year-old is living with two 50-year-old parents would have an average age of 40 and be included in the study. Depending on the ages of the former student and his/her parents, many such households will be included, though many others may be excluded. This is certainly a limitation of the data, but using average age to work around that limitation is a perfectly legitimate way to study this type of data.
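To make the average-age rule concrete, here is a minimal sketch of how such a filter could work. The function name is hypothetical, and treating both endpoints (20 and 40) as included is my assumption, chosen to match the example above; the report itself only says “between 20 and 40.”

```python
from statistics import mean

def is_young_household(adult_ages, low=20, high=40):
    """Return True if the mean age of the adults falls within [low, high].

    Inclusive endpoints are an assumption made for this illustration.
    """
    return low <= mean(adult_ages) <= high

# A 20-year-old living with two 50-year-old parents: the mean age is 40.
print(is_young_household([20, 50, 50]))

# A 25-year-old living with two 55-year-old parents: the mean age is 45.
print(is_young_household([25, 55, 55]))
```

The first household counts as “young” and the second does not, which is the sense in which some households with adult children living at home are included and others are excluded.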
Sicha’s Point #2:
One effect of this age spread sample is that it includes college graduates from up to almost 20 years ago. This is literally not at all a study of college graduates of the last five years, or even ten years. We’re talking about people up to the age of 40, well into Gen X.
Right. This is not a study of recent college graduates. However, the data from the Federal Reserve Board’s Survey of Consumer Finances (SCF) has been collected in a consistent manner every 3 years since 1989. The survey collects something called cross-sectional data, which you can think of like a snapshot of the population at a certain point in time. By looking at how the snapshots change over time, we can see patterns that give us insights into how people with student loans today differ from people who had student loans 10 years or 20 years ago.
The only way this would be a serious problem is if you believe that some big change happened in the last 5 or 10 years that would make current student loan recipients very different from those of just a few years before. Instead, all evidence suggests that the number of students taking out loans, the amounts of those loans, and tuition rates have all been rising steadily over time. Therefore, a cross-sectional data analysis is a perfectly legitimate way to study this type of issue.
Sicha’s Point #3:
Also, in this survey, when there are multiple people in the household, the Brookings Institution simply divided the amount of college debt by number of people in the household. So one person’s $20,000 college debt becomes two people’s $10,000 college debt. This works out mathematically, of course, but not structurally.
Again, this point is just factually wrong. The Brookings Institution makes it very clear that the numbers behind the NYT chart, which shows that 7% of young adult households with student loan debt have debt exceeding $50,000, refer to households as a whole, not to average debt per person in the household. Elsewhere in the paper they do divide by the number of members of the household to get what they call a “mean per-person debt,” but that’s presented as just one more way to look at the data rather than the way to look at the data. Regardless, the 7% number and the NYT chart that it comes from are based on the debt of the full household rather than the mean per-person debt of the household.
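The distinction matters because the two measures can give different answers to the “how many are over $50,000?” question. The sketch below uses five made-up households (not data from the Brookings report or the SCF) purely to show the arithmetic; all names are hypothetical.

```python
# Hypothetical households: (total student debt in dollars, number of members).
# These figures are invented for illustration only.
households = [(60_000, 2), (20_000, 1), (55_000, 1), (30_000, 2), (10_000, 1)]
THRESHOLD = 50_000

def share_over(values, threshold):
    """Fraction of values strictly greater than the threshold."""
    return sum(v > threshold for v in values) / len(values)

# Measure 1: total debt of the whole household (what the 7% figure uses).
household_totals = [debt for debt, _ in households]

# Measure 2: debt divided by household size ("mean per-person debt").
per_person = [debt / members for debt, members in households]

share_household = share_over(household_totals, THRESHOLD)
share_per_person = share_over(per_person, THRESHOLD)

print(f"Share over threshold, household totals: {share_household:.0%}")
print(f"Share over threshold, per-person debt:  {share_per_person:.0%}")
```

In this toy data the household-total measure puts more households over the threshold than the per-person measure does, so mixing up the two changes what a headline percentage means.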
I get why Sicha got confused. David Leonhardt, the author of the NYT article, tweeted the following:
The worries are exaggerated: Only 7% of young adults with student debt have $50,000 or more. http://t.co/Aavawc8KpC
— David Leonhardt (@DLeonhardt) June 24, 2014
In other words, Leonhardt mixed up this point as well and tweeted factually incorrect information. In his tweet Leonhardt wrote “only 7% of young adults with student debt have $50,000 or more” when he should have written “only 7% of young adult households with student debt have $50,000 or more.” Although a careful reading of both the NYT article and the Brookings Institution study would have cleared this up, I can see why Sicha could get it wrong considering the author of the source piece confused this point as well.
Sicha’s Point #4:
And finally: The number of the people making up this data is quite small.
Is it too small? How do you know it is too small? Saying that the “sample size is too small” is a comment I hear quite often from well-meaning skeptics who are criticizing studies, usually without any explanation as to how they know it is too small. It turns out that statistics has formal tools, such as margin-of-error and power calculations, that tell you what sample size you need in order to make certain types of claims. Just saying it’s “too small” is meaningless if you cannot explain exactly why it is too small.
In a blog post criticizing Sicha’s article, Fredrik DeBoer took on Sicha’s statement that the sample size was not large enough, explaining exactly what is problematic about this statement.
This is something I’ve written about before– people dramatically overestimate the sample size needed to make responsible statistical conclusions. A sample size of almost 2,000 isn’t just big, it’s enormous. The standard error of a sample of this size will be very low. Absent systematic sampling bias (as opposed to error), the odds of the underlying population being significantly different from a sample of this size is tiny. Saying that it’s not a big sample just displays ignorance about the standards applied in statistical research.
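To put a rough number on that, here is a back-of-the-envelope sketch of the standard error for a 7% share estimated from 2,000 households. It assumes a simple random sample and rounds “almost 2,000” up to 2,000; the real SCF uses a complex weighted design, so its true standard errors would differ (typically somewhat larger). The variable names are mine.

```python
import math

n = 2000   # assumed sample size ("almost 2,000" in the quote above)
p = 0.07   # estimated share of households with more than $50,000 in debt

# Standard error of a sample proportion under simple random sampling.
standard_error = math.sqrt(p * (1 - p) / n)

# Approximate 95% margin of error using the normal approximation.
margin_95 = 1.96 * standard_error

print(f"Standard error:      {standard_error:.4f}")
print(f"95% margin of error: {margin_95:.4f}")
print(f"Rough 95% interval:  {p - margin_95:.3f} to {p + margin_95:.3f}")
```

Under those simplifying assumptions the margin of error comes out to roughly one percentage point, which is nowhere near large enough to turn 7% into something dramatically different.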
Really though, you don’t need to be a statistician to know that the sample size probably was fine in this case. The data is not coming from some shady organization or some tiny study. The data comes directly via the Federal Reserve. It is officially collected by trained statisticians at the National Opinion Research Center (NORC)* at the University of Chicago and has a lot of oversight and transparency. You do not need to have a degree in statistics to know that all the other people with degrees in statistics are using this data and have not expressed concerns about the sample size so just maybe there isn’t actually a sample size issue here.
Sicha’s Point #5:
And finally… this survey is, essentially, of rich people. No, literally!
“We apply survey weights throughout the analysis so that the results are representative of the U.S. population of households. The use of survey weights is particularly important in the SCF because the sample design oversamples high-income households to properly measure the full distribution of wealth and assets in the United States. This high-income sample makes up approximately 25 percent of households in the SCF.”
Literally what they are saying there is that the information on which they are basing a sweeping assessment of American student loan debt is based on a sample in which 25% of those surveyed were “high-income households.” This is insane. (Update: I wanted to clarify that I get it that they are weighting this over-representation down to represent the population at large; that’s not my beef, entirely. Mostly I think it shows a further weakness in their non-rich sample at large.)
Ok, no. Just no. No. Oversampling small groups within the sample is not a weakness of a study, it’s a strength. As an example, let’s travel to Essos where Daenerys Targaryen, Mother of Dragons, has just taken over Meereen. To determine how well she’s doing as the new ruler of the city, she commissions a study to determine her approval rating among the people. Before Daenerys’ reign, the city had serious economic inequality with a small number of citizens owning the vast majority of the wealth (known as the Great Masters) and the rest of the people living in either poverty or enslavement. When Daenerys took over the city and outlawed slavery, it produced an upheaval, with many Meereenese being better off, some whose lives are probably the same, and some, especially those who previously held power in the city, finding themselves powerless and with large losses in wealth. Obviously, the former Great Masters likely have a very different opinion of Daenerys’ rule than does the average citizen.
In Meereen about 1% of citizens were previously Great Masters and the other 99% are much poorer or are freed slaves. If we survey 100 people, we would expect to survey 99 regular citizens and 1 former Great Master. However, only surveying 1 former Great Master will not get us a very good sample of how the typical former wealthy citizen feels about Daenerys. Additionally, we don’t really need to survey 99 of the non-wealthy citizens to get an accurate picture of their opinions. It would make more sense for us in this case to oversample the former Great Masters and then reweight the data to match the population.
Instead of surveying 99 regular Meereenese and 1 former Great Master, we could survey 80 regular citizens and 20 former Great Masters. Once we get an accurate measure of Daenerys’ approval rating in each grouping, we can reweight the numbers to match the population. This would actually get us a more accurate measure of Daenerys’ approval rating than we would get just taking a random cross section of Meereen. It would also give us a richer dataset, so we could study how opinions within each group differ even though one group may be much smaller than the other.
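Here is a minimal sketch of what that reweighting looks like in practice. The population shares and sample sizes come from the Meereen example; the approval counts are invented for illustration, and the variable names are my own.

```python
# Population shares and the oversampled survey design from the example.
population_share = {"regular": 0.99, "great_master": 0.01}
sample_size = {"regular": 80, "great_master": 20}

# Hypothetical survey results: how many respondents in each group approve.
approvals = {"regular": 60, "great_master": 2}

n_total = sum(sample_size.values())

# Each respondent's weight = share of population / share of sample.
weights = {
    group: population_share[group] / (sample_size[group] / n_total)
    for group in sample_size
}

# Naive estimate that ignores the oversampling.
unweighted = sum(approvals.values()) / n_total

# Weighted estimate that restores each group to its population share.
weighted = (
    sum(weights[g] * approvals[g] for g in sample_size)
    / sum(weights[g] * sample_size[g] for g in sample_size)
)

print(f"Unweighted approval: {unweighted:.2%}")
print(f"Weighted approval:   {weighted:.2%}")
```

Each former Great Master counts for far less than one “regular” respondent in the weighted total, so the 20 of them cannot drag the citywide number around, yet we still have enough of them to say something about that group on its own.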
Oversampling small population groups within a larger population is a mark of a good study run by researchers who really know what they are doing. It is hardly “insane”; it simply shows that the researchers are making sure they get the most accurate and rich dataset that they can.
The only thing I can conclude from the Awl piece is that the author misunderstood many parts of the Brookings Institution study. I get that he is trying to disprove all the naysayers on Twitter who are using the NYT article as evidence that there is no student loan crisis, but instead of attacking the underlying data he should be attacking their interpretation of the data. The underlying data is sound and contains a lot of evidence that there are some serious issues with student loan debt in the US. The NYT article mentions many of them, but it’s also worth mentioning that even the original piece of data that caused this kerfuffle, that 7% of households with student debt have over $50,000 in debt, is not exactly a small number. $50,000 is a lot of debt and over 1 in 20 households with student loans have debt higher than that amount. That sounds like quite a problem to me.
*For full disclosure, I had a professor who headed NORC during my grad school days and have had many friends who work or have worked for NORC.
Featured photo from @gameofthrones