The Takedown of the NYT Chart on Student Loan Debt Isn’t All it’s Cracked Up to Be

It’s no secret that I’m a sucker for articles pointing out why some giant study being reported on in the media is actually bunk. In fact, that’s pretty much the only thing I ever write about here at Skepchick. So, this week when I saw everyone on my Facebook feed sharing an article from the Awl entitled “Gaslighting Millennials: That Big Study About How the Student Debt Nightmare Is in Your Head? It’s Garbage,” which promised to be a takedown of a New York Times article, I immediately jumped on it.

I started out completely on board with the Awl’s takedown of the Grey Lady herself, but the further I got into the article, the more I wondered whether the takedown itself was the one in the wrong. Although the author seemed well-meaning, he didn’t seem to have much statistical knowledge or to understand what exactly makes a study strong or weak.

I followed up the Awl piece by reading the NYT article it was criticizing. The Awl strongly implies that the NYT article claims there are no real problems with massive student loan debt in the US, but the NYT piece doesn’t say that at all. Its argument is that the problems exist, just not in the stereotypical form we usually imagine. The cliché, according to the NYT, is the student who went to a private school and racked up tens or hundreds of thousands of dollars in debt. The article then provides evidence from the Brookings Institution study that students with more than $50,000 in debt are actually only a small portion of the total number of Americans with student loan debt. It goes on to present a lot of evidence that the real problem with student loan debt lies with drop-outs who took out thousands of dollars in loans and then never graduated with a degree. In other words, student loans are supposed to be an investment: you take on debt now on the promise that your higher future wages will pay it off. Students who don’t graduate take on the debt but never gain the future wage benefit of a college degree.

Ok ok ok, but the issue that Choire Sicha, author of the Awl article, had with the NYT was over the claim that “Only 7 percent of young-adult households with student debt have more than $50,000 in such debt.”

Sicha had a whole list of problems with the Brookings Institution study that came up with that 7% number. I’ll go through each of Sicha’s points and see which hold up to scrutiny.

Sicha’s Point #1:

Those aren’t households with people between 20 and 40; those are households headed by people between 20 and 40. Which is to say, this data excludes all people living in households headed by, say, their parents, or other adults. The way Brookings put this is: “households led by adults between the ages of 20 and 40.” Just another way to say it excludes all households led by anyone over 40! (Those households might be identical in student debt to “young” households! Or they might not? WHO KNOWS!)

This point is just factually untrue. According to the Brookings Institution report, a young household is defined as one in which the “average age of adults is between 20 and 40.” In other words, a young adult living with his or her parents would still have their household included in the study as long as the average age of all the adults in the house was between 20 and 40. So, a household where a 20-year-old is living with two 50-year-old parents would have an average age of 40 and be included in the study. Depending on the ages of the former student and the parents, many such households will be included, though many will be excluded. This is certainly a limitation of the data, but the Brookings Institution’s method of working around it by using average age is a perfectly legitimate way to study this type of data.
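To make the average-age rule concrete, here is a minimal Python sketch. It assumes both ends of the 20-to-40 range are inclusive (the quoted definition does not say), and the example households and the function name `is_young_household` are made up for illustration; none of this is the Brookings Institution's actual code.

```python
from statistics import mean

# Assumed rule, based on the report's wording: a household counts as "young"
# when the average age of its adults is between 20 and 40.
# Treating both endpoints as inclusive is an assumption of this sketch.
def is_young_household(adult_ages, low=20, high=40):
    """Return True if the mean adult age falls within [low, high]."""
    return low <= mean(adult_ages) <= high

# Hypothetical households, invented for illustration.
households = {
    "20-year-old with two 50-year-old parents": [20, 50, 50],
    "24-year-old with two 55-year-old parents": [24, 55, 55],
    "two 28-year-old roommates": [28, 28],
}

for label, ages in households.items():
    status = "included" if is_young_household(ages) else "excluded"
    print(f"{label}: mean age {mean(ages):.1f} -> {status}")
```

The first household sits right at the boundary and is included, while the second is excluded, which is exactly the limitation described above: some young adults living with older parents drop out of the sample.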

Sicha’s Point #2:

One effect of this age spread sample is that it includes college graduates from up to almost 20 years ago. This is literally not at all a study of college graduates of the last five years, or even ten years. We’re talking about people up to the age of 40, well into Gen X.

Right. This is not a study of recent college graduates. However, the data from the Federal Reserve Board’s Survey of Consumer Finances (SCF) has been collected in a consistent manner every 3 years since 1989. The survey collects something called cross-sectional data, which you can think of like a snapshot of the population at a certain point in time. By looking at how the snapshots change over time, we can see patterns that give us insights into how people with student loans today differ from people who had student loans 10 years or 20 years ago.

The only way this would be a serious problem is if you believe that some big change in the last 5 or 10 years made current student loan recipients very different from those of just a few years before. Instead, all evidence suggests that the number of students who take out student loans, the amounts of those loans, and tuition rates have all been changing steadily over time. Therefore, a cross-sectional data analysis is a perfectly legitimate way to study this type of issue.

Sicha’s Point #3:

Also, in this survey, when there are multiple people in the household, the Brookings Institution simply divided the amount of college debt by number of people in the household. So one person’s $20,000 college debt becomes two people’s $10,000 college debt. This works out mathematically, of course, but not structurally.

Again, this point is just factually wrong. The Brookings Institution makes it very clear that the numbers behind the NYT chart, which shows that 7% of young adult households with student loan debt have debt exceeding $50,000, describe households as a whole, not average debt per person in the household. Elsewhere in the paper they do divide by the number of members of the household to get what they call a “mean per-person debt,” but that’s presented as just one more way to look at the data rather than the way to look at the data. Regardless, the 7% number and the NYT chart it comes from reflect the debt of the full household rather than the mean per-person debt of the household.

I get why Sicha got confused. David Leonhardt, the author of the NYT article, tweeted the following:

In other words, Leonhardt mixed up this point as well and tweeted factually incorrect information. In his tweet Leonhardt wrote “only 7% of young adults with student debt have $50,000 or more” when he should have written “only 7% of young adult households with student debt have $50,000 or more.” Although a careful reading of both the NYT article and the Brookings Institution study would have cleared this up, I can see why Sicha could get it wrong considering the author of the source piece confused this point as well.
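The difference between household-level debt and mean per-person debt is easy to see with a toy example. This is only a sketch: the household below is hypothetical, and the helper names are mine, not anything from the Brookings Institution paper.

```python
THRESHOLD = 50_000  # the "more than $50,000" cutoff behind the 7% figure

def household_debt(debts_by_member):
    """Total student debt across all members of the household."""
    return sum(debts_by_member)

def mean_per_person_debt(debts_by_member):
    """Household total divided by the number of household members."""
    return household_debt(debts_by_member) / len(debts_by_member)

# Hypothetical two-person household: one member owes $60,000, the other nothing.
debts = [60_000, 0]

print("household debt above threshold:", household_debt(debts) > THRESHOLD)
print("per-person debt above threshold:", mean_per_person_debt(debts) > THRESHOLD)
```

Measured as a household, this example clears the $50,000 line. Measured per person, it does not. The 7% figure uses the first measure.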

Sicha’s Point #4:

And finally: The number of the people making up this data is quite small.

Is it too small? How do you know it is too small? “The sample size is too small” is a comment I hear quite often from well-meaning skeptics who are criticizing studies, usually without any explanation as to how they know it is too small. It turns out that statistics has formal methods that tell you what sample size you need in order to make certain types of claims. Just saying it’s “too small” is meaningless if you cannot explain exactly why it is too small.

In a blog post criticizing Sicha’s article, Fredrik DeBoer took on Sicha’s statement that the sample size was not large enough, explaining exactly what is problematic about this statement.

This is something I’ve written about before– people dramatically overestimate the sample size needed to make responsible statistical conclusions. A sample size of almost 2,000 isn’t just big, it’s enormous. The standard error of a sample of this size will be very low. Absent systematic sampling bias (as opposed to error), the odds of the underlying population being significantly different from a sample of this size is tiny. Saying that it’s not a big sample just displays ignorance about the standards applied in statistical research.
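For a rough sense of what “very low” standard error means here, the sketch below uses the textbook formula for the standard error of a proportion. It makes two loud assumptions: that the “almost 2,000” households can be treated as a simple random sample of 2,000, and that this is the sample behind the 7% estimate. The SCF actually uses survey weights and a more complex design, so the real uncertainty will be somewhat larger than this back-of-the-envelope number.

```python
import math

def proportion_standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

p = 0.07  # share of households with more than $50,000 in debt, as reported
n = 2000  # "almost 2,000", rounded for illustration (an assumption)

se = proportion_standard_error(p, n)
margin = 1.96 * se  # approximate 95% margin of error (normal approximation)

print(f"standard error: {se:.4f}")
print(f"approximate 95% interval: {p - margin:.3f} to {p + margin:.3f}")
```

Under those assumptions the 7% estimate is pinned down to within about one percentage point either way, which is hardly the mark of a sample that is too small to say anything.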

Really though, you don’t need to be a statistician to know that the sample size probably was fine in this case. The data is not coming from some shady organization or some tiny study. The data comes directly from the Federal Reserve. It is officially collected by trained statisticians at the National Opinion Research Center (NORC)* at the University of Chicago and has a lot of oversight and transparency. You do not need a degree in statistics to notice that the people who do have one are using this data and have not expressed concerns about the sample size, so just maybe there isn’t actually a sample size issue here.

Sicha’s Point #5:

And finally… this survey is, essentially, of rich people. No, literally!

“We apply survey weights throughout the analysis so that the results are representative of the U.S. population of households. The use of survey weights is particularly important in the SCF because the sample design oversamples high-income households to properly measure the full distribution of wealth and assets in the United States. This high-income sample makes up approximately 25 percent of households in the SCF.”

Literally what they are saying there is that the information on which they are basing a sweeping assessment of American student loan debt is based on a sample in which 25% of those surveyed were “high-income households.” This is insane. (Update: I wanted to clarify that I get it that they are weighting this over-representation down to represent the population at large; that’s not my beef, entirely. Mostly I think it shows a further weakness in their non-rich sample at large.)

Ok, no. Just no. No. Oversampling small groups within the sample is not a weakness of a study, it’s a strength. As an example, let’s travel to Essos where Daenerys Targaryen, Mother of Dragons, has just taken over Meereen. To determine how well she’s doing as the new ruler of the city, she commissions a study to determine her approval rating among the people. Before Daenerys’ reign, the city had serious economic inequality, with a small number of citizens (known as the Great Masters) owning the vast majority of the wealth and the rest of the people living in either poverty or enslavement. When Daenerys took over the city and outlawed slavery, it produced an upheaval, with many Meereenese being better off, some whose lives are probably the same, and some, especially those who previously held power in the city, finding themselves powerless and with large losses in wealth. Obviously, the former Great Masters likely have a very different opinion of Daenerys’ rule than does the average citizen.

In Meereen about 1% of citizens were previously Great Masters and the other 99% are much poorer or are freed slaves. If we survey 100 people, we would expect to survey 99 regular citizens and 1 former Great Master. However, only surveying 1 former Great Master will not get us a very good sample of how the typical former wealthy citizen feels about Daenerys. Additionally, we don’t really need to survey 99 of the non-wealthy citizens to get an accurate picture of their opinions. It would make more sense for us in this case to oversample the former Great Masters and then reweight the data to match the population.

Instead of surveying 99 regular Meereenese and 1 former Great Master, we could survey 80 regular citizens and 20 former Great Masters. Once we get an accurate measure of Daenerys’ approval rating in each grouping, we can reweight the numbers to match the population. This would get us a far more accurate measure of how the former Great Masters feel than we would get from a random cross section of Meereen, while the reweighting keeps the overall approval rating representative of the city. It would also give us a richer dataset, so we could study how opinions within each group differ even though one group is much smaller than the other.
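The reweighting step can be sketched in a few lines of Python. The 99%/1% population split and the 80/20 sample come from the example above; the approval counts are invented purely for illustration, and the function names are my own. This is a toy version of the idea, not the SCF's actual weighting procedure.

```python
# Population shares from the Meereen example.
POP_SHARE = {"regular": 0.99, "great_master": 0.01}

# Hypothetical survey results (invented for illustration):
# group -> (number surveyed, number who approve of Daenerys)
SAMPLE = {"regular": (80, 56), "great_master": (20, 2)}

def naive_approval(sample):
    """Pooled approval rate that ignores the oversampling."""
    surveyed = sum(n for n, _ in sample.values())
    approve = sum(k for _, k in sample.values())
    return approve / surveyed

def weighted_approval(sample, pop_share):
    """Weight each group's approval rate by its share of the population."""
    return sum(pop_share[g] * k / n for g, (n, k) in sample.items())

print(f"naive (unweighted): {naive_approval(SAMPLE):.3f}")
print(f"reweighted:         {weighted_approval(SAMPLE, POP_SHARE):.3f}")
```

With these made-up counts, the unweighted number badly understates Daenerys’ approval because the unhappy former Great Masters make up 20% of the sample but only 1% of the city. The reweighted number counts each group in proportion to its real size, which is what the survey weights in the SCF are for.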

Oversampling small population groups within a larger population is the mark of a good study by researchers who really know what they are doing. It is hardly “insane”; it simply shows that the researchers are making sure they get the most accurate and rich dataset that they can.

The only thing I can conclude from the Awl piece is that the author misunderstood many parts of the Brookings Institution study. I get that he is trying to rebut all the naysayers on Twitter who are using the NYT article as evidence that there is no student loan crisis, but instead of attacking the underlying data he should be attacking their interpretation of the data. The underlying data is sound and contains a lot of evidence that there are some serious issues with student loan debt in the US. The NYT article mentions many of them, but it’s also worth mentioning that even the original piece of data that caused this kerfuffle, that 7% of young-adult households with student debt have over $50,000 in debt, is not exactly a small number. $50,000 is a lot of debt, and more than 1 in 20 households with student loans (roughly 1 in 14) have debt higher than that amount. That sounds like quite a problem to me.

*For full disclosure, I had a professor who headed NORC during my grad school days and have had many friends who work or have worked for NORC.

Featured photo from @gameofthrones

Jamie Bernstein

Jamie Bernstein is a data, stats, policy and economics nerd who sometimes pretends she is a photographer. She is @uajamie on Twitter and Instagram. If you like my work here at Skepchick & Mad Art Lab, consider sending me a little sumthin' in my TipJar: @uajamie

13 Comments

  1. That was my objection to the study- you don’t need debt greater than $50,000 for it to be a huge burden on your life. Even “only” $10,000 in debt can be a massive burden if you don’t manage to find a job or if you are underemployed.

    1. It’s especially problematic for those who took out the loans and then for whatever reason never ended up getting their degree. Now they have a huge debt, none of the benefits that come from a college degree, and likely also missed out on a couple years of work. It’s a problem all around.

      The Awl author is correct that the people that are trying to point to this study to prove that student loan debt is not an issue for families are completely and utterly wrong. He just went about it in the wrong manner.

      1. Huh. No, it goes beyond just those that don’t get the degree. I got mine, and still ended up working a crap job, because I had no “job experience” **at all**, and couldn’t even, at the time, get my foot in the door at the bottom, never mind at the level my supposed degree was supposed to prepare me for. And, that is the absurd thing, really. WTF good does it do to know how to write code for DB management and other tasks, if the only job you can get, from the start, is data entry, where you spend the first who knows how many years, while the technology, languages, etc. all outpace you, just typing things into the DBs, not actually designing or maintaining them? And, that is without even mentioning the uselessness of a degree that is basically just about maintaining and designing databases, because silly shit, like knowing the latest OSes, or writing code to do **anything** else is, “Unnecessary for the career path.”, according to the lousy place you went to. I am not sure how much of any other similarly narrow, or even wider ranged, degrees would have been any better, but then, that is because I didn’t get one of those. What I did get, by the time I was done, was no immediate job, and a loan that, by the time I could finally pay it off, had ballooned to $25k. Not amusing at all.

          1. I wasn’t suggesting that you were implying this, just pointing out, maybe too strongly?, that there are a lot of people without the benefits of their degrees, whether they got them or not, and that, therefore, it is, on some level, meaningless if the study itself was or wasn’t done in a way to prove his point about families somehow being better off. The problem is that his point is just wrong, period. Just about anything is “less of a burden” if you have a lot of different people working together to pay it off. This doesn’t mean, at all, that the debt in question isn’t still unreasonable. It’s merely much, much, much, worse for people that either can’t get jobs in the field their degree comes from, or never get one at all (both of which are functionally equivalent situations).

  2. I agree with most of your points about the Awl article, but I have one nit to pick. When you talk about average age of adults in a household, you’ve really only pointed out that some households are barely included. A 24 year old living with parents in their 50s would still be excluded, and it’s reasonable to ask if this produces a systematic bias due to the number of “Boomeranger” households having increased, precisely because young people have had trouble gaining financial independence.
    I also want to point out that, although the Awl article is probably wrong to criticize the study for basing its conclusion about student loan debt on 1992 comparisons, since tuition hikes have been going on for decades, the same is not true when talking about income and the burden of monthly payments. The Great Recession very harshly impacted inexperienced workers, which has lasted all the way to the present day, so when you talk about a recent student loan debt crisis, you really want to treat people who have graduated in the past 6-7 years almost as being in a different category. The study really just doesn’t have the data to do that, since it only has one data point after the recession, and that point lumps together post-recession graduates with pre-recession graduates who had the chance to land a decent job before layoffs during the financial crisis flooded the economy with more experienced workers.

      1. I totally agree with you. The data is certainly not perfect. However, because all they had to work with was household data, I think the way the researchers decided on what constituted a young household was a good method with the imperfect data they were using.

        Certainly if there is a big jump in differences in the last 5 years, it would definitely mean that recent graduates should be put in a separate category. I’m not sure that’s the case (most of the evidence I’ve seen has made it look like the changes are more gradual). Even so, the study was never meant to look at just recent graduates but to look at the state of student loan debt as a whole. Of course it would be bad for looking at recent graduates, but that’s not what it’s doing.

        Oh, and I keep wanting to call it the Owl rather than the Awl. I had to keep correcting it as I typed the article.

  3. To be fair, anyone who makes those kinds of errors either had a daddy who gave them not only a new building but a new campus, or never went to college.

    Another thing I thought of was that he sounds like he’s all “Psh, $50k, I make more than that in a week.”

  4. “As an example, let’s travel to Essos where Daenerys Targaryen, Mother of Dragons, has just taken over Meereen.” I assume this means something to fans of Game of Thrones(?) or something but I have no idea.

    1. Don’t worry, it’s actually really OOC for Dany to change her mind; she’s very headstrong and idealistic, which will cause her problems in the near future.
