Proving and Quantifying Sexism

In many areas of life, especially in cases where a person must apply to enter an organization, field or job, there are large gender gaps. Whenever feminists call for equal representation by women in a particular endeavor, detractors often claim that adding more women means better-qualified men will get bumped down. They claim that in order to get an equal number of women and men, the bar must be lowered on quality. For example, some in the skeptic movement have claimed that adding more women speakers at skeptic and atheist conferences means replacing more qualified men with less qualified women.

As a feminist, I don’t believe this is true. I believe that increasing representation of women will increase the quality of the endeavor. I also believe the thing keeping women from certain fields in which they are under-represented is institutionalized sexism. As a social scientist, though, I don’t just want to believe these things. I want to prove them and I want to measure them. To do that, we have to find something measurable that would be an effect of institutionalized sexism.

First, let’s create a game theory model to come up with a hypothesis of how the world should look if there is sexism present. Then, we’ll go out into the real world to see if we can find examples of it in action.

Let’s say we’re a conference and we’re putting together a panel of experts. We want 6 people on our panel but we have 10 candidates: five women and five men. The men and women have varying qualifications for the panel, which we’ll rate on a scale of 1-5 (with 5 being the most qualified). Here they all are, with the women represented in purplish-pink and the men in blue (because if we’re already assuming only two genders exist for this problem, we might as well go with the cultural representations of those genders). The numbers represent their qualification rating. Two individuals with the same rating are considered of equal qualification (so the woman rated as 4 is exactly as qualified as the man rated as 4).

[Figure 1: Quantifying Sexism. The ten candidates, five women and five men, each group rated 1 to 5.]

In a world with no sexism, it’s really simple to choose the 6 individuals for our panel. The most qualified panel would consist of the following individuals:

[Figure 2: Quantifying Sexism. The panel chosen without sexism: three men and three women, rated 5, 4 and 3.]

This panel is ½ men and ½ women. The men on the panel have an average qualification of 4 and so do the women. The average qualification of the entire panel is 4. Even if we assume slight sexism where men are always chosen above women of exactly equal ability, we would still end up with our panel of three men and three women.

Now, let’s change things up a bit. Let’s assume there is some institutionalized sexism and women are perceived to be one qualification level lower than their actual level. Additionally, whenever a man and woman are perceived to be of equal ability, the man is always chosen before the woman. Now, our panel will look a little something like this:

[Figure 3: Quantifying Sexism. The panel chosen with sexism: four men (rated 5, 4, 3 and 2) and two women (rated 5 and 4).]

We now have a panel that consists of 2/3 men and 1/3 women. The average qualification score of the men on the panel is now 3.5 (a 0.5pt decrease) and the women’s is 4.5 (a 0.5pt increase). The average qualification of the entire panel has dropped from 4 in our gender-neutral world to about 3.8.
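The two scenarios above can be reproduced in a few lines of Python. This is only a minimal sketch of the toy model: the names `select_panel`, `mean_rating` and `penalty` are made up for illustration, while the ratings, the one-level penalty and the tie-break rule are the ones described in the text.

```python
from fractions import Fraction

# Candidate pool from the example: five men and five women, each rated 1-5.
POOL = [(gender, rating) for gender in ("M", "F") for rating in range(1, 6)]

def select_panel(pool, size=6, penalty=0):
    """Pick `size` candidates by perceived rating.

    Women are perceived `penalty` levels below their true rating, and
    a man is chosen before a woman with the same perceived rating.
    """
    def sort_key(candidate):
        gender, rating = candidate
        perceived = rating - penalty if gender == "F" else rating
        # Highest perceived rating first; men first on ties.
        return (-perceived, 0 if gender == "M" else 1)
    return sorted(pool, key=sort_key)[:size]

def mean_rating(panel, gender=None):
    """Average TRUE rating of the panel, optionally for one gender only."""
    ratings = [r for g, r in panel if gender in (None, g)]
    return Fraction(sum(ratings), len(ratings))

fair = select_panel(POOL)               # no perception penalty
biased = select_panel(POOL, penalty=1)  # women perceived one level lower

# Averages for men, women and the whole panel.
print(mean_rating(fair, "M"), mean_rating(fair, "F"), mean_rating(fair))
# 4 4 4
print(mean_rating(biased, "M"), mean_rating(biased, "F"), mean_rating(biased))
# 7/2 9/2 23/6
```

Exact fractions are used so that the 23/6 (about 3.8) panel average is not hidden by rounding.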

We now have a testable hypothesis!

In any grouping that is supposed to consist of the most qualified people and has a large gender imbalance, if that gender imbalance was caused by institutionalized sexism either in the choice of individuals or the admittance of individuals into the pool of candidates (for example, discouraging women from studying in STEM fields), then the women in the group will be more qualified on average than the men in the group. Additionally, rival groups with more equal representation of women will be more qualified on average than rival groups with fewer women.

Now that we have a hypothesis, let’s go out into the world to see if it holds true. Luckily, we don’t need to do this ourselves because other scientists have already looked into this. Here are three studies that show that this hypothesis holds water:

  • The boards of Fortune 500 companies in the US consist of a mere 16% women. But, within those 500 companies, there is a lot of variation. According to our hypothesis, if the lack of women on boards of large corporations is due to sexism, then boards that contain more women should be, on average, better than boards that contain fewer women. A recent study shows that this is in fact the case and that the stocks of companies with more women on their board do better, on average, than their less equal counterparts.
  • Only 18% of hedge funds on Wall Street are run by women. If women are just as capable as men and the imbalance is due to institutionalized sexism, then the woman-run hedge funds should do better on average than the man-run hedge funds. Again, a recent study shows that hedge funds run by women produce an average return of 9% while the industry standard is a mere 3%.
  • The current 113th Congress will contain a record 20 women in the Senate and 78 in the House for a grand total of 18%. According to our theory, if the gender imbalance is caused by voter sexism, these 98 women should be better politicians than their male counterparts. Since the 113th Congress is new, we don’t have stats on them yet, but a study in 2011 found that Congresswomen outperformed Congressmen by bringing more federal money to their districts and sponsoring and co-sponsoring more bills, even after controlling for party affiliation.*

If you know of any other similar studies that show women outperforming men in fields that lack diversity, please leave a note in the comments. I’d really love to learn about more studies of this type.

Institutionalized sexism isn’t just some crazy idea feminists came up with to try to get qualified men fired and replaced by unqualified women. Evidence out in the real world shows that in areas which contain large gender imbalances, the women are generally better at their jobs than the men. Since we don’t have any good reason to believe that women are somehow just innately better than men, it means that sexism is present either in the choices of individuals or in the pool of individuals that the choices will come from.

MRAs often say that feminists think that women are better than men. In actuality, the more sexism that is present, the more women will be better than their male counterparts. When gender diversity increases in a group, we should see the average ability of women in the group decline to be more equal to that of the men and the average quality of the entire group rise.

If we apply this to skeptic and atheist conferences, which often contain far more male speakers than female, increasing the number of women speakers should increase the overall quality of the conference. In the STEM fields, which tend to lack women, increasing the number of women will raise the quality of the scientists graduating with STEM degrees. In large companies that often lack female leadership, companies that strive to have more gender-balance will be more likely to out-compete their competitors. Once you get a hang of the idea that diversity=quality, you start to see potential for gains everywhere.

*Full Disclosure: The study author Christopher Berry was one of my graduate degree professors.

Jamie Bernstein

Jamie Bernstein is a data, stats, policy and economics nerd who sometimes pretends she is a photographer. She is @uajamie on Twitter and Instagram. If you like my work here at Skepchick & Mad Art Lab, consider sending me a little sumthin' in my TipJar: @uajamie

  1. Interesting hypothesis. Makes sense. You might also want to also put a call out for studies that don’t support this hypothesis – at least if you’re going for any sort of rigor. My first thought when reading the list of studies was “This would be really easy to cherry-pick.”

    1. Definitely. I’m just not sure that people report situations like “women and men hedge fund managers are exactly the same!” Also, if that were the case, it wouldn’t necessarily mean the model is wrong. It would mean, though, that there are factors other than sexism causing the difference. The model only really works when the only thing causing the gender imbalance is sexism.

      But, I would be interested to see cases in which men and women in a gender-imbalanced endeavor were exactly the same. That would be interesting and mean that there was something else interesting going on. I’m just not sure these cases are reported as often.

      1. Well I’d guess the causation for outperformance is that the women have to fight much harder to get to these positions, and so the individuals who are willing to put up such a tenacious fight are tenacious people in general who will create great results.

        I think if institutional sexism and all the other unique things that stress women professionally did not exist, you might see the numbers even out.

    2. Cherry picking is the stated goal: “If you know of any other similar studies that show women outperforming men in fields that lack diversity, please leave a note in the comments. I’d really love to learn about more studies of this type.”

      1. Please, you out of all people accusing people of having a preconceived bias? That’s hilarious.

  2. Good stuff here, though I have to wonder how one rates a politician.

    The closest I can offer, and this really has no studies but rather an impression, is Linda Greenlaw.
    AFAIK, she’s the only female swordboat captain. She manages to do her job just as well as her male counterparts.
    Not really a case of sexism, but rather how well a woman does in a male dominated field.

    1. I’m at work right now so I can’t really spend much time looking into it, but the article I linked to on the study with Congressmen vs Congresswomen does have a link to the original paper. It’s been over a year since I read it myself so I don’t want to comment on their methods without a re-read.

  3. Couldn’t skeptical and atheist meetings themselves be an example of this? First, we can compare historical meetings to current meetings. In the past, women were underrepresented in talks and panels at many meetings. Since efforts have been made to include more women as speakers, the quality of the conferences should have gone up. If we take TAM as the prime example, more recent meetings, with more women speakers and attendees, have had the best attendance. Of course, we have general growth as a confounding variable here, but it’s interesting nonetheless. It would be more interesting if we had measures for actual panel discussions. Which were the best attended? Which were the best reviewed?

    1. Gender imbalance in the Skeptic and Atheist community is one of the reasons I wrote this article. A thing I hear a lot is “Oh, we wanted to have more female speakers, but we are adding speakers based on qualifications only, so if there aren’t any women it doesn’t mean we are sexist. It just means there weren’t any qualified women.”

      This is the same thing you hear a lot in companies, for example. “Oh, we want more women in leadership positions but there just aren’t any that are qualified enough.” But, if this were the case, then when we compare the men to the women in the positions they should be equally qualified. The fact that the women are more qualified than the men means that there was some kind of sexism that crept into the process. The numbers don’t lie.

      I would love to see some data on skeptic and atheist communities, but it would be really difficult to get enough data or to really measure how “good” speakers or members are. It’s far more difficult than comparing stock prices.

      1. I read a fascinating blog post a few weeks ago that described how a technology conference achieved a female speaker rate of 25% despite only 10% of proposals being submitted by women. The secret? Their call for presentations stated that talks would be chosen anonymously and they did outreach to encourage women to submit proposals. Perhaps the approach might be useful in the Skeptic and Atheist community? Details of their methodology are at http://2012.jsconf.eu/2012/09/17/beating-the-odds-how-we-got-25-percent-women-speakers.html and I highly recommend checking it out.

      2. As the organizer of the Women in Secularism Conference, I can tell you that there are just as many qualified women speakers as men. My problem is narrowing my choices for the conference every year. I could easily fill a four day conference and turn down dozens of qualified speakers. The problem is that many conference organizers would rather go with their old standbys than do some research to find fresh faces.

  4. Thanks for the stock market tip. I’m going to be rich!
    Seriously though, the effect you speak of is even greater when you consider the impact of experience lost when women are denied the opportunity to practice at their optimal level in their chosen field.
    Public speaking is a prime example of practice being essential to performance. If the skeptic movement wants to shoot itself in the foot, one really excellent way would be to deny women the opportunity to develop their skills.
    I for one will not tolerate a New Dark Age of the Shitlords!

    1. Agreed. These things snowball. Sexism in the way men and women are treated in STEM fields, for example, happens when we are just children. Because women are discouraged, the bar is set higher from a young age and women drop out. By the time an engineering firm is looking to hire college grads, there are not many women left to choose from, but those that are in the pool of candidates will likely be on average of better quality than the men in the pool.

      1. Yes, even a slight imbalance at each level means that at the top of the field, there could be a huge imbalance even if the selection process at the highest level is rigorously meritocratic. The result is that there will be fewer women than men at the top level of any field affected in this way, and any project looking for top-level performers will have more men to choose from. However, the women at that level will necessarily be better on average than their male counterparts, so without knowing the details of exactly how good any top-level individual is, it makes more sense to choose a woman than a man. It would therefore be both unfair and illogical to simply choose at random or based on a strict ratio. There should in fact be a deliberate bias in favour of women. Arguably, no men should be chosen until all the available women have been selected (unless there is specific information indicating an individual man is more qualified than an individual woman). This is an odd conclusion (and not one I intended coming to) but seems logically sound.

        1. Ok, a couple things. First of all, I made a mistake in thinking about the pool of candidates the same as choosing the candidates. In fact, if the weeding out was happening at an earlier stage and the later choice was based on perfect meritocracy, we would end up with a field in which the women are of equal qualification to the men, even if there are more men. If we see that the women are more qualified than the men, it would be a sign of sexism in the final choices, regardless of whether or not there was sexism involved in entering the pool of candidates. Sorry for stating the opposite earlier. I had not thought about it clearly.

          Secondly, you have to keep in mind that the outcome would be that the AVERAGE quality of women will be higher than that of the men if sexism is present. However, it doesn’t mean that any individual woman is more qualified than any individual man. However, working backward, if all you knew about a group was the gender of the individuals and that sexism against women was involved and you were forced to make a choice, because the women are higher quality on average than the men, it would make sense to choose a woman.

          But, you have to be careful with that too. My assumption was that difficulty of entering a field for women would mean that only the best of the women would choose to enter the field. But, this is just a conjecture. It is also possible that women are discouraged based on a factor like personal grit rather than their intellect. In which case, there would be fewer women in the pool, but the quality of those women would be equal to that of the men. In that case, you would not necessarily want to choose all the women before the men.

  5. Just wondering how this applies specifically to skeptical and perhaps atheist conferences/panels etc. I’m aware of the atheist data (and I assume with some confidence that the skeptical population overlaps heavily with the A pop) which shows for every country men outnumber women more than 2:1. If you break down the numbers in many different ways, for example attendance, belief and so on, the proportion remains consistently bigger than 2:1. There are some interesting hypotheses about this such as men having a more ‘autistic’ mind which makes it more difficult to believe but the relevant point for this discussion is that this lopsided demographic exists and is immune to cultural and geographic boundaries and so forth. I don’t know if my logic holds, but if we assume (quite rightly I would assert) that, all things being equal, male atheists are as competent as female atheists then if we see a panel that has twice as many men as women then we can assume that that panel has maximized its strength and shows no signs of institutional sexism? If the panel is closer to 4:1 men to women or worse then we would suspect sexism. Am I right in this thinking?

    1. Sorry, Spencer. Your logic doesn’t hold. The lopsided demographic may be consistent across cultures, but all you can say (if this hypothesis is true) is that groups with a 2:1 ratio might perform better (whatever that means) than groups with a 4:1 ratio. But you’d have to compare groups with a 2:1 ratio to groups with a 1:1 ratio to see if a panel has maximized its strength.

      We’re not talking about how well panels reflect the demographics of a group. We’re talking about the success of organizations that have 50/50 leadership. If a group’s membership is 2:1 men to women, we still suspect sexism. If men and women atheists are equal, then some other factor is keeping women away.

      1. Yes, exactly what Karenx said. If there was some innate difference, like men being more “skeptically minded” than women, then when comparing the men in the group to the women, their qualification levels should be exactly the same. The fact that in the endeavors where we were able to measure outcomes the women out-perform the men leads to the conclusion that sexism is present.

    2. // I’m aware of the atheist data (and I assume with some confidence that the skeptical population overlaps heavily with the A pop) which shows for every country men outnumber women more than 2:1//


      1. If I’m reading these statistics correctly, and assuming “unaffiliated” is closely linked to atheism, the gender difference seems quite a bit smaller than a 2:1 ratio.

        In the context of this discussion I have trouble considering the notion that this lopsided statistical result is in and of itself an indication of sexism, given that a reported lack of theistic beliefs is not by any definition I’m aware of tantamount to having membership in an organization. I fully appreciate the value of diversity and of recognizing and addressing entrenched sexism and disparity within organizations. However, I sometimes wonder if some (many?) men are more willing to identify themselves as being an atheist because it makes them something of a philosophical bad ass outlier compared to most folk, which might make the atheist or even skeptic label less appealing to some women despite it being an accurate representation of their thinking. I suppose it may be similar to how many people went along with the temperance and prohibition movement, not because they thought drinking alcohol was a bad thing, but because they just didn’t want to look like they supported the sin of drunkenness. One bit of research I saw hinted at the male propensity for impulsivity and men’s lower rates of long term planning which would mean fewer men would be worried about going to hell. I can only hope that particular scientist understands that many men and women abandon their religious beliefs after engaging in rational non impulsive thinking that resulted in a rejection of supernatural claims.

      2. Probably atheistcensus.com. The problems with relying on that as a source are obvious. It’s for people who self-identify as atheists and is publicised through atheist and skeptic channels. If they have a skewed ratio, so will the census.
        In the Irish census of 2011, the ratio of male to female “atheists” was around 2:1 while among those professing “No Religion” (a much larger number) it was around 3:2.
        This suggests there is an inherent gender bias but that could still be influenced by other factors.

  6. I’d go even farther than this. While the above analysis is correct, if we are faced with a situation where the men *actually are* better than women, that does not mean that institutional sexism is *not* the reason.

    UAJamie said:

    In any grouping that is supposed to consist of the most qualified people and has a large gender imbalance, if that gender imbalance was caused by institutionalized sexism either in the choice of individuals or the admittance of individuals into the pool of candidates (for example, discouraging women from studying in STEM fields), then the women in the group will be more qualified on average than the men in the group. Additionally, rival groups with more equal representation of women will be more qualified on average than rival groups with fewer women.

    While this is true at the outset, most skills involve significant experience and/or training. If institutionalized sexism plays into who receives such experience and training, then there will be more qualified men in the field than women. The women in the group, on average, will be more qualified than the men if and only if general aptitude is the deciding factor about who is able to overcome the institutionalized sexism and enter the field. That may not always be the case. There could be factors such as independence, stubbornness, rebellion, ethics, etc. which result in a woman bucking the patriarchy and entering a “male” domain, but which don’t actually make her better in the given position.

    In that situation, men and women who receive the same training and experience would be at equal skill levels. However, if you have 10 men and 5 women, picking the best six candidates would get you four men and two women. Assume the same distribution as above. You’ve got ten men: two 5’s, two 4’s, etc., and five women, one 5, one 4, etc. Your best pool of six would be M5, M5, F5, M4, M4, F4 – four men and two women.
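The arithmetic in this scenario is easy to check with a short Python sketch. The variable names are invented for illustration; the pools are exactly the ones listed in the comment above, and selection is purely by true rating with no bias.

```python
# Ten men (two at each rating) and five women (one at each rating).
men = [5, 5, 4, 4, 3, 3, 2, 2, 1, 1]
women = [5, 4, 3, 2, 1]
pool = [("M", r) for r in men] + [("F", r) for r in women]

# Unbiased selection: take the six highest true ratings.
# The cut falls exactly between the 4s and the 3s, so no tie-break is needed.
panel = sorted(pool, key=lambda c: -c[1])[:6]

men_on_panel = [r for g, r in panel if g == "M"]
women_on_panel = [r for g, r in panel if g == "F"]

print(len(men_on_panel), len(women_on_panel))  # 4 2
print(sum(men_on_panel) / len(men_on_panel))      # 4.5
print(sum(women_on_panel) / len(women_on_panel))  # 4.5
```

The panel is two-thirds men, yet the men and women on it have the same average rating, which is the signature of an imbalance created before the final selection.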

    In that scenario, the position offered is truly meritocratic, and sexism is not entering into the analysis, but institutionalized sexism is still to blame for the imbalance.

    Applied to speakers at conferences, an organizer could be deciding on speakers based on experience level with other conferences, which is a facially gender-neutral and relevant criterion. However, even applying this gender-neutral standard will result in the systematic exclusion of women because of women’s historical difficulty gaining experience speaking at conferences.

    Thankfully, I doubt that experience is really all that relevant to the quality of a speaker, so my guess would be that, if one could come up with a way to objectively rate the quality of speakers at conferences, it would follow the above trend, where the women would generally score higher than the men. But even if that’s not the case, institutionalized sexism could still be the reason, and changes would still need to be made to have an inclusive space.

    1. Yes! I was thinking about it this afternoon and realized I was wrong in assuming that sexism in creating the pool of candidates would create the same imbalanced quality outcomes. You are right that this is not the case. In fact, even if sexism exists in creating the pool (for example, in discouraging women from being involved in the Atheist community), as long as the choice of winners from the pool was based only on meritocracy, we would NOT see a quality difference between men and women. However, if we do see a gender quality difference, it is the result of sexism in the choice of winners, regardless of what the original pool of candidates looks like. I actually like this better because it removes the excuse that the sexism happened earlier in the process and not in the final decision.

  7. There is a slight issue I see here which is that there are two conflicting underlying issues which could give the same effect: as stated, there could be sexism suppressing the general perception of women’s qualification level, and I’ve seen enough to think this happens. But people who argue sexism isn’t an issue believe that the higher profile given to men means that they must be more qualified: they must be better if people talk to them, about them or whatever more often. In the real world equal qualification is harder to measure. This doesn’t really address that argument at all.
    Not that I think people have a duty to counter all arguments in one go. However, reading this I did think “Yes but you can never demonstrate equal competence in the real world”, and we’d see the same effect if the pool of female candidates was actually less qualified – for sexist or other reasons – as if the perception of qualified candidates was artificially suppressed by sexism.
    Having real issues with the mobile interface here so sorry if words have randomly vanished or appeared somewhere inappropriate…

    1. The difficulty of measuring “quality” is why studies that look into this phenomenon typically look at performance outcomes. The first study I mentioned looked at stock prices. The second looked at hedge fund returns. The third looked at the number of bills sponsored and the money brought home to their districts. Obviously, none of these are perfect, but they are “good enough” for getting a fairly unbiased measurement of quality. Trying to do this with speakers at skeptical conferences would be extremely difficult. There are just going to be areas in which we are unable to come up with an easily measurable characteristic to prove the phenomenon.

  8. So excited you are here with us Jamie!! Here’s another bit of data for you:
    PLoS One. 2012;7(11):e49682. doi: 10.1371/journal.pone.0049682. http://www.ncbi.nlm.nih.gov/pubmed/23185407#
    Stag parties linger: continued gender bias in a female-rich scientific discipline. Isbell LA, Young TP, Harcourt AH.
    Male-organized symposia have half the number of female first authors (29%) that symposia organized by women (64%) or by both men and women (58%) have, and half that of female participation in talks and posters (65%).
    Additional coverage of all male panels in technical conferences here, and a pledge to not participate in panels that don’t reflect the diversity of a field: http://www.timeshighereducation.co.uk/story.asp?sectioncode=26&storycode=422408&c=1

    It’s not just a skeptical thing–I was not invited to speak at a symposium at the Entomological Society National Meeting in 2012 …but 3 different presenters (MALE) who were invited, but relatively new to the field, called me to ask what they should say.
    It is a relatively modest statement of fact to say that I am -the- entomology social media expert. (My Klout score is 75, if anyone actually cares).

    I get quite crabby when people talk about there being “no qualified women” (or minorities) to talk at skeptics meetings. Bull.shit.
    It’s even more BS when someone like Shermer makes the claim that women just aren’t interested. Being “intellectually active” is “a guy thing.” I know an awful lot of women that beg to differ. In fact, I know many amazing women and people of color that also don’t get the call to speak.

    Don’t pick me?–fine. I have plenty of other stuff to do, even if you reject me.
    But don’t pick any of the many other amazing folks available? For lame ass reasons? That is unacceptable.

    1. Oh, I really love that second link. My favorite line from it: “Getting a gender-balanced panel isn’t a sign of filling quotas…or other such absurd accusations, it’s a sign of a functioning meritocratic process.”

      I can’t stand when people say they aren’t sexist, there are just no qualified women. Ugh! What I love about this model is that as long as you can find an unbiased way to measure quality, you can prove that sexism is playing a factor. I would love to see similar studies done looking at the quality of ethnic minorities in fields in which they are underrepresented. My guess would be that we would see the same quality imbalance because racism was present in the selections.

  9. There is also experimental evidence that, holding skill constant, women are less likely to select into competitive environments as individuals (http://qje.oxfordjournals.org/content/122/3/1067.abstract; also in the interest of full disclosure one of the authors is a current PhD advisor of mine; http://onlinelibrary.wiley.com/doi/10.1111/j.1468-0297.2010.02409.x/abstract?deniedAccessCustomisedMessage=&userIsAuthenticated=false). As a result, if preferences or underconfidence mean that only the 5s and 4s amongst the women are willing to participate, then the differences observed could be due to the preferences/underconfidence of females and not necessarily to the sexism of the process, no?

    Now you can always argue that even that explanation ultimately derives from historical sexism breeding differences in confidence and preferences for competition, but the policy implications become much less clear. Affirmative action solves the supply side problem (institutional sexism), but not the demand side one (preferences/confidence).

    All in all, differences in average ability between the sexes are few and far between and fostering equality in representation across a broad range of endeavors is a worthwhile goal. But the precise nature of the cause of the existing gaps needs to be understood before we can start thinking about how to best address it.

    1. Actually, if you rerun the model and change the pools of candidates, you’ll find that the model still holds.

      Let’s first change the quality levels of the original pools so we have 5M 4M 3M 2M 1M and 5F 5F 4F 4F 3F. Now the female pool is more qualified than the male pool. When we select our 6 winners without bias, we get 5M 5F 5F 4M 4F 4F. We have 2/3 women and 1/3 men. The average qualification of the women is 4.5 and the average qualification of the men is 4.5. Therefore, according to our model, sexism doesn’t seem to have played a role in choosing the individuals for the panel. Additionally, a better qualified female pool resulted in MORE women on our panel, not fewer.

      But, obviously the pools of men and women are still the same size. So, let’s now double our male pool to represent more men. We now have the following individuals in our pool: 5M 5M 4M 4M 3M 3M 2M 2M 1M 1M and 5F 5F 4F 4F 3F. The male pool is more spread out quality-wise but there are more of them. The female pool is limited to only higher quality individuals but is much smaller. The panel winners we choose from this pool if sexism is NOT present would be the following: 5M 5M 5F 5F 4M/4F 4M/4F. In the last two slots we could end up with 2 men, 2 women or 1 man and 1 woman. Statistically, we’re most likely to end up with 4M and 4F. This would result in a panel of 1/2 men and 1/2 women with the men and women of equal qualifications. If 2 men are chosen at the end, it would then make the men have an average qualification level a bit lower than the women. But, importantly, if we made many such panels, the average over the panels between the men and women would still be equal assuming there is no sexism present. If over many panels we see a qualification difference between the genders, we can assume there is sexism present.
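The "many panels" claim can be checked by enumerating every way of filling the last two slots. The following is a rough sketch that assumes each pair of 4-rated candidates is equally likely to be picked; the variable and function names are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations

# The four 5-rated candidates are always selected.
fixed = [("M", 5), ("M", 5), ("F", 5), ("F", 5)]
# Four candidates rated 4 compete for the two remaining slots.
tied = [("M", 4), ("M", 4), ("F", 4), ("F", 4)]

def mean(panel, gender):
    """Average rating of one gender on a panel."""
    ratings = [r for g, r in panel if g == gender]
    return Fraction(sum(ratings), len(ratings))

# All six equally likely ways to fill the last two slots.
panels = [fixed + list(pair) for pair in combinations(tied, 2)]

# Share of panels that end up with one 4M and one 4F (three men in total).
mixed = sum(1 for p in panels if sum(g == "M" for g, _ in p) == 3)
p_mixed = Fraction(mixed, len(panels))

# Average of the per-panel gender means across all possible panels.
avg_men = sum(mean(p, "M") for p in panels) / len(panels)
avg_women = sum(mean(p, "F") for p in panels) / len(panels)

print(p_mixed)               # 2/3
print(avg_men == avg_women)  # True
```

Under this equal-likelihood assumption, the mixed outcome occurs two times out of three, and the men's and women's averages match once all panels are taken together.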

      1. Yes, but what if the number of women is a binding constraint, i.e. you can’t fill out your panel in an unbiased fashion without exhausting the supply of (willing) women? Or put another way, to fill out your panel in an unbiased fashion, you would need to conscript or coerce women. Let’s say you have 5M 4M 3M 2M 1M and 5F 4F (with 3F 2F and 1F existing, but opting out of the competition). Now the group selected without prejudice is 5M 5F 4M 4F 3M 2M. This results in a grouping that is supposed to consist of the most qualified people and has a large gender imbalance but there is no institutionalized sexism. Still, the average quality of the women participating (4.5) is greater than that of the men (3.5). So the evidence presented could be consistent with this mechanism as well, no? The only change in the model is that the candidates have to opt in to the process and therefore there are demand-side factors at play.

        Incidentally, like ceolaf, I too believe that institutionalized sexism is likely to be fairly commonplace and a big contributor to the imbalances observed above. Still, I think that there are more mechanisms consistent with this evidence than the one presented here (including the ones raised by ceolaf) that also contribute and may be less simple to solve.

        1. Yes, yes and yes!! If there is a binding constraint on the number of individuals in the pool, you can end up with something that looks just like our sexism version of the model even if there was no sexism present in the decision-making process. Therefore, this would be a bad model to use in an area where there was a very limited number of individuals in the pool. Like, let’s say you were creating a panel consisting only of individuals in an extremely rare field where only about 10 people exist. You’re going to end up with some super weird results that break the model.

          As you mentioned, in areas in which there is an opt-in process this excuse is used a lot (not our fault there are no women because no women applied!). But, usually the opt-ins come from a bigger pool of all potential qualified individuals of which the opt-ins are a subset. In the end, it means that outreach to women to encourage opt-ins would increase the quality of the final group of winners. So, “women just didn’t apply” is a terrible excuse. Overall quality could be increased using outreach.
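
To make the opt-in point concrete, here is a minimal sketch using the pools from this exchange. The function names `top_panel`, `gender_gap` and `overall_quality` are hypothetical, and the code assumes a completely unbiased selector who simply takes the six highest ratings.

```python
def top_panel(pool, size):
    """Unbiased selection: take the `size` highest-rated candidates."""
    return sorted(pool, key=lambda c: c[0], reverse=True)[:size]


def gender_gap(panel):
    """Average rating of the women minus average rating of the men."""
    men = [r for r, g in panel if g == "M"]
    women = [r for r, g in panel if g == "F"]
    return sum(women) / len(women) - sum(men) / len(men)


def overall_quality(panel):
    """Average rating of everyone on the panel."""
    return sum(r for r, _ in panel) / len(panel)


men = [(r, "M") for r in (5, 4, 3, 2, 1)]
all_women = [(r, "F") for r in (5, 4, 3, 2, 1)]   # everyone who could apply
opted_in = [(r, "F") for r in (5, 4)]             # only the two who did apply

full = top_panel(men + all_women, 6)       # outreach brought everyone in
limited = top_panel(men + opted_in, 6)     # 3F, 2F and 1F opted out

print(gender_gap(full), overall_quality(full))
print(gender_gap(limited), overall_quality(limited))
```

With the limited pool the women on the panel outscore the men even though the selector is unbiased, which is the confound raised above. With the full pool the gap disappears and the panel’s overall quality goes up, which is the case for outreach.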

  10. The hypothesis is indeed very interesting, I tried to find examples from my own field (science, academia). In general, success is measured by publications and I tried to see if women-headed labs publish more or less than male-headed labs. I did not find exactly the type of study I was looking for, but I found data suggesting that women in academia overall publish less than their male counterparts. That in principle goes against your hypothesis but so many factors are involved, that it is hard to figure out what is going on. I’m not a social scientist, so I can’t really elaborate on the quality of these studies. I’m interested in Jamie’s opinions and those of others.



    1. Oh, those are interesting articles. As for the first one, they looked at names and determined whether they were “female” names or “male” names and determined that there were fewer people with female names represented than male. I would actually be most interested in comparing authorship rates among women with androgynous names to women with feminine names. Additionally, comparing women with androgynous names to men with androgynous names. Gender imbalance in and of itself is a possible consequence of sexism, but there are other things that could cause the gender imbalance that are not related to sexism. But, if they could show that women with feminine names have fewer papers published than women with androgynous names, that would be convincing evidence that sexism is present in the selection of papers for publication.

      But, the part of that article that I thought provided the most evidence towards sexism being the culprit was this line: “Female professors are more likely to emphasize quality over quantity, some scholars argue, turning out fewer but meatier pieces than do their male colleagues, who are more apt to increase their productivity by publishing their work in more-frequent chunks.” They are saying that women have fewer papers published but those papers are on average BETTER QUALITY than the papers published by male authors. This is EXACTLY what our game theory model would predict if sexism was present.

  11. This is painfully flawed.

    I DO believe that there is a ton of institutional and cultural sexism out there. A TON. But this piece utterly fails to make the case.

    First, and this should be obvious, this kind of test does not lend any proof that a hypothesis is true. Rather, it shows that these examples do not DISPROVE the hypothesis. It only shows that these examples are consistent with the hypothesis. Not proof.

    Second, two of the three examples (i.e. the data for the proof) rely on a logical sleight of hand that is not justified. The hypothesis talks about individual ability. It does not address group dynamics, organizational contexts or organizational outcomes. It does not even address individual outcomes. It is JUST about ability. But the first two examples are about organizational outcomes. Organizational success by a particular metric is not necessarily attributable to the leader of the organization. Find the top five schools in the country by SAT scores. Do you think that the principals of those schools are responsible for that organizational outcome? Harvard has the biggest endowment. Do you think the President of Harvard is responsible for that? Look at whatever college/university’s endowment grew by the most (in $ or in %) in 2012. Would you credit the president of that college for that? In fact, the most able leaders can lead rather mediocre organizations, but face greater challenges and pre-conditions that contribute to organizational outcomes. (The third example, congressfolk, is also an organizational thing. Chiefs of staff and LDs have enormous influence, with staff that need to be directed. This example also raises huge questions about multiple goals. It is possible that congressfolk who treat pork are bad, because they trade $$ for votes/integrity. I don’t know. But I would not ever throw out there that THE way (or even a particularly good way) to judge our reps’ performance is by pork.)

    Third, there is a HUGE question of the causal direction. Perhaps firms that are already more successful are more likely to hire women. Perhaps they feel that they have the credibility cushion to go against cultural sexism. Thus, there is a selection effect here, and the causality runs counter to what is argued. I actually have no idea. And the data in this “proof” does nothing to tell us.

    I applaud the goal. And I am a fan of using simple thought experiments to try to figure out what is going on and how we might find evidence that an idea is valid or not. I’m not arguing with the theory or the general approach to testing it. But this piece proves nothing, in spite of its title and its author’s intentions. I’m not even convinced that it shows anything about institutional sexism (a much lower bar than proof). As a social scientist, I want to see far better arguments than this, especially when they are trying to advance ideas that I believe in.

    1. Hi Ceolaf. Thanks for the response. You bring up a lot of interesting points. I’ll take them in order.

      1) As for your first point: Yes. You are exactly right, but this is just the way model-based theories like this are. The “proofs” are only as good as the model you built. If you build a shitty model with incorrect assumptions, the results will be meaningless. To disprove the model and its results, you need to disprove the assumptions of the model. There could be wrong assumptions underlying it that no one sees, in which case the model could look right but be completely wrong. This also means that you can never 100% prove any type of game theoretic model like this. But, just because you can never get to 100% doesn’t mean we have to throw it out completely. If we did that, we’d have to throw out the bulk of all social science. In the end, we go with what we have and what we know and adjust if we come up with a good reason to think our model isn’t very good.

      2) Actually, the model DOES address group dynamics. You’ll notice that the group with more gender diversity has an overall average qualification rating higher than that of the group with less diversity. The group dynamics aspect only works if the quality of the pools of candidates does not somehow vary along with the gender of the pools. So, lets say you have a firm out in a rural area. In this case, they might have less qualified candidates to choose from and less women to choose from. This would cause our model to come up with a result that makes it look like they are engaging in sexist practices when in fact, they are just choosing from a markedly different pool of candidates. However, I think cases like this are rare and in cases like the first study I mentioned, I don’t believe this to be a factor and the model should still hold.

      As for the other part of this point, yes, for the studies I mentioned to be relevant, you have to assume that the outcomes are at least partially a result of the ability of the individuals within the group. However, if you are going to argue that outcomes have nothing to do with the people, then you’re saying that the outcomes should be randomly distributed across the people, in which case we should not see women consistently having better outcomes than men. This then, goes into your third point.

      3) Yes again, we are assuming a causal direction in these studies and assume that the outcome measures represent quality. Again back to my point that pretty much all social science conclusions tend to conclude with “interesting evidence for the idea but not 100% proof.” I have my own issues with some of the assumptions, especially as you mentioned in the last one regarding Congress. Maybe measuring money brought home to districts is NOT a measure of quality. That depends on your definition of quality and that is completely legitimate. However, if having a problem with the assumptions invalidated the entire study, there would be no valid social science studies. From what I could tell, all three of these studies look pretty valid to me even if I know that some of the assumptions you have to make are suspect. I certainly haven’t come up with a better way to do an unbiased measurement of quality, so if this is the best we can come up with I’m ok with that and will accept it until I have better evidence of a better way to measure. You’re welcome to take up any issues you have with the authors of the studies.

      But seriously, thanks again for your comment. If I were writing this as a scientific paper I would go into all the assumptions that have to be made in order for the results to be valid and all possible problems with those assumptions. But, since it is merely a blog post, that would be a bit distracting, make the post SUPER long, and probably not be interesting to the layperson. But, all your points are valid criticisms of the method.

  12. I’m always annoyed about this “the best person for the job” anyway.
    Usually, you have no clue who that is or might be, because you can’t just put them all on the Holodeck for a month and run a program.
    For most jobs there are way more applicants than positions. A considerable number of those people will be fully qualified to do the job you’re hiring for. I doubt that “the best” could be measured by any means, but people keep kidding themselves that they’re able to do so and surprisingly still end up with the white guy…

    1. Yep that’s right. Because in the hiring process, there’s always the charm factor where people will hire based on who makes them feel good emotionally.

      And in male dominated organizations, guess who that tends to be.

  13. I have thought about this method before, and I like it. I think it is a powerful piece of evidence that women are not discriminated against. For example: feminists have done a lot of complaining about how peer review is biased against women, yet when you look at paper citation rates, women actually receive the same number of citations or fewer, thus indicating that peer review is not a higher bar for them (1, 2, 3).

  14. “If women are just as capable as men and the imbalance is due to institutionalized sexism, then the woman-run hedge funds should do better on average than the man-run hedge funds.” It’s not clear to me how this directly follows from your earlier premises. Could you flesh out your reasoning here a bit more?

    1. I’m not Jamie, but I think the reasoning is in her two initial graphs: If women need to do a lot better than men to get a position, then women who are in the same position as a man should be on average more qualified.

      1. Yes, what Giliell said. Also, there are some group dynamics at play. You’ll notice in the model that the gender-diverse group had a higher average qualification than the less gender-diverse group. This leads to the conclusion that the more gender-diverse groups will be able to out-compete their non-diverse competitors.

  15. I’m new to feminism, etc. and would like to know how these points and questions factor in:

    1) There are way more women in “New Age” conferences than men, and way more men in UFO conferences than women.
    2) There are way more men speakers in UFO conferences than women speakers, but what is the proportion of women / men speakers in “New Age” conferences?

    Both points are worth a careful look in my opinion.

    Btw, English is not my native language, so excuse the grammar, etc.

    1. Sometimes the pool of individuals will be more or less gender diverse and in the end that will affect the diversity of the qualified sub-group of speakers chosen from the pool. However, the model will still hold regardless of the make-up of the pool.

      So, let’s take a pool with 2/3 men and 1/3 women: 5M 5M 4M 4M 3M 3M 2M 2M 1M 1M 5F 4F 3F 2F 1F
      If we chose the 6 most qualified to be our speakers, we would get: 5M 5M 5F 4M 4M 4F
      Our speaker diversity matches the pool diversity with 2/3 men and 1/3 women. But crucially, the avg qualification of the men is 4.5 and the avg qualification of the women is also 4.5. So, even though there is not a 50/50 gender split, we could say that there was probably not sexism involved in the choosing of speakers.

      However, if we added the one-qualification-level sexism penalty to our model, we would get the following speakers: 5M 5M 4M 4M 5F 3M. We now have 5/6 men and 1/6 women. Also, the average qualification of the men is only 4.2 and the average qualification of the women is 5. Therefore, we could say that sexism was probably present in the selection of individuals.
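
Both selections above can be reproduced with a few lines of code. This is a rough sketch under stated assumptions, not the post’s own implementation: `select_panel` and `average_by_gender` are hypothetical names, sexism is modeled as a flat `penalty` subtracted from each woman’s perceived rating, and ties in perceived rating are assumed to go to the man (the rule that reproduces the 3M in the last slot).

```python
def select_panel(pool, size, penalty=0):
    """Pick `size` candidates by perceived rating.

    pool    -- list of (true_rating, gender) tuples
    penalty -- amount subtracted from women's perceived rating (assumption)
    Ties in perceived rating are broken in favor of men (assumption).
    """
    def sort_key(candidate):
        rating, gender = candidate
        perceived = rating - penalty if gender == "F" else rating
        return (perceived, gender == "M")

    return sorted(pool, key=sort_key, reverse=True)[:size]


def average_by_gender(panel):
    """Average TRUE rating of the men and of the women on the panel."""
    averages = {}
    for gender in ("M", "F"):
        ratings = [r for r, g in panel if g == gender]
        if ratings:  # skip a gender with nobody on the panel
            averages[gender] = sum(ratings) / len(ratings)
    return averages


# Pool from the comment: 2/3 men, 1/3 women.
pool = [(r, "M") for r in (5, 5, 4, 4, 3, 3, 2, 2, 1, 1)]
pool += [(r, "F") for r in (5, 4, 3, 2, 1)]

fair = select_panel(pool, 6)               # no sexism
biased = select_panel(pool, 6, penalty=1)  # one-level penalty for women

print(fair, average_by_gender(fair))
print(biased, average_by_gender(biased))
```

The gap in true ratings between the women and the men on the panel only appears in the penalized run, which is the signal the model looks for.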

  16. Jamie, check out this article about gender discrimination in music and specifically orchestras:

    There is still a lot of work that needs to be done to improve the environment for women working and performing in an orchestra; however, the implementation of blind auditions (where the judges can’t see who specifically is playing) has helped quite a bit.

    1. Oh, that article is so interesting! I remember learning about how historically orchestras were almost entirely male until they instituted blind auditions. But, I had no idea there was still gender discrimination in orchestras going on today. Orchestras would be smart to institute a blind audition process because it would result in a better orchestra than they would get with a non-blind audition process.

  17. Wow, loved this thought experiment. (I am a woman in a STEM field, worn down from the constant sexism.)
    One request: Please, please! Get the difference between “less” and “fewer” straight. In a snip of your comment (below), I can’t tell if you think that they have fewer qualified candidates, or if they have poorer quality candidates. The next bit literally reads that you think they have partial women. ;-)

    “So, lets say you have a firm out in a rural area. In this case, they might have less qualified candidates to choose from and less women to choose from.”

  18. The model makes a lot of sense. I think there’s a slight deficiency in the first two examples inasmuch as they all report better *collective* performance — which is a prediction of the model, but the model says that the mechanism for better collective performance in less sexist groups is specifically due to higher qualification levels among the women, not overall. So what I’m suggesting is that the first two examples are consistent with less sexist groups having higher qualification levels among men as well as women, without much difference between men and women. It seems like examples like the Congress data provide better support for your model than the CEO and hedge fund data. Maybe the orchestra dissertation has something like that as well?

    1. Oops sorry, my cat walked on my laptop as I was typing and pressed enter. (<– True story!)

      There are two different predictions to the model: Gender diverse groups will have little quality difference between men and women and they will have a higher overall quality rating than the sexist groups.

      In other words, the model takes collective performance into account. The overall qualification level of the gender diverse group in the example was 4 while the sexist group had an avg quality of 3.8. This was the result of less qualified men taking the place of higher qualified women. If these groups were competing, we would expect the diverse group to win most of the time.

      So, when a gender diverse and a sexist group are competing, on average the gender diverse group should do better. This is what we see in example 1. Additionally, in sexist groups, the quality difference between the men and women should be high, with the women rated higher than the men. This is what we see in examples 2 and 3. It's two different results from the same model.

  19. I once took a look at the figures from the APA on women in philosophy departments in the US (philosophy is one of the fields in which women are most poorly represented at all levels: it’s the worst humanities subject for it). They showed that those departments which were ranked higher tended to have a higher percentage of full-time, permanent women faculty than philosophy departments as a whole (22% of faculty in top-51 institutions, as opposed to 16.6% of faculty nationally).

    I got interested in this after reading about these guys: http://www.nationalpost.com/m/wp/news/blog.html?b=news.nationalpost.com%2F2012%2F08%2F10%2Fphilosophy-gender-war-sparked-by-call-for-larger-role-for-women

  20. Hi.
    I’m curious about whether your proposed model takes into account, or whether you’ve tried to use it for, other possible factors of bias such as racism or class-based discrimination.

    For example, sports seems like a rich area for research:

    http://www.arthurhu.com/index/asports.htm – I frankly don’t know what to make of this page in total – like whether it is somehow arguing for equal representation of Asians in sports – but it was one of the few pages I could find in a quick Google search that attempts to list statistics about representation in sports.

    http://partners.nytimes.com/library/national/race/070200sports-transcript.html – this is a kind of roundtable discussion on the topic.

    Do you think your model explains race or class imbalances in sports? Or why even great athletes who are nonwhite or female seem to have trouble breaking into the management of sports teams? Or how to break down cause and effect in cases where class might figure in (racket sports or equestrian sports)?


    1. If race/class imbalances are caused by racism then it should follow the same model. In order to determine that the lack of Asians in the NBA was caused by racism, you’d have to show that Asian NBA players are on average better players than their non-Asian counterparts. You could do the same with white NBA members or with Black team owners if you think racism might be present there as well. You just need an objective measure of ability to use to compare. In a world with no racism, on average all ethnic groups within the league should play at about the same ability. If a group is better than average, it would indicate that racism may play a factor.

      As you mentioned, ethnic groups are more likely to come from vastly different backgrounds than are men and women. In more posh sports like racket sports or equestrian, as you mentioned, it’s possible we’re seeing more white people just because they’re more likely to grow up in a community that plays these sports. However, the model should still hold. The important part of the model is that the presence of an underrepresented minority does not imply sexism/racism. It is only if the average ability of the minority group is higher than that of the majority group that we can say it was likely caused by sexism/racism. If we measure the abilities of all the players at Wimbledon (because that’s the only tennis thing I know) and we find that the white and non-white players are on average of equal ability, we can say that racism was probably not involved and that the lack of non-white players was caused by other factors (perhaps class, cultural interest, etc). However, if we find that the non-white players are better on average than the white players, it makes it likely that racism is involved.

      That’s why this model is so great. It can separate out whether racism or sexism was involved from non-direct racist/sexist factors that might decrease minority involvement.

  21. Hi Jamie,
    You may want to read “A Case Study of Gender Bias at the Postdoctoral Level in Physics, and its Resulting Impact on the Academic Career Advancement of Females” by Sherry Towers. It looks at this effect in physics.

    1. Oh, I found a link to the study: http://arxiv.org/pdf/0804.2026v3.pdf

      I don’t have time to read the whole thing right this second, but from the abstract it definitely seems like this would be another example. The line that stands out to me is “The study finds that the female researchers were on average significantly more productive compared to their male peers.” That alone shows that there was likely bias involved in getting their research positions.

  22. You make a lot of assumptions to prove your case. First of all, you assume that sexism is the reason for disparities in male dominated fields. Where is your conclusive evidence? Second, you assume that there is such a thing as “perceived equal ability”, as in every resume given to every company always has an equal but gender opposite resume to pick from. How can you even remotely come to that conclusion confidently? Your entire first paragraph is based on the assumption that adding more women makes everything better because they are women. On a scale of one to five, if you have all level five men and a level 4 woman and you only want the top 5 speaking at your conference, it makes no sense to include the level 4 woman by removing a level 5 man. We are talking about merit here. But, lets get to your idea that women are more qualified, on average, than men when there is a large disparity in small group. Since you know how averages work already, I’m going to create a demonstration of how your argument is flawed. There is a disparity in nurses at a hospital. Fifteen of them are women and 5 are men. On average, men are more qualified than women because there are fewer of them. This is probably true no matter what group you use and here’s why; average is the collective amount “qualification” divided by the total. All it takes is a few crappy female nurses to bring the average way down. All it takes for the small number of men to appear better at their job is to be good at it. Less men= Lower failure rate. Antivaxxers use this same way of thinking to show that vaccines are harmful. They say that more unvaccinated kids, on average, are healthier than vaccinated kids, which is true before you discover the fact that way more kids get vaccinated. Obviously you are going to see this result.

    1. Ok, I’ll play!

      1. First of all, you assume that sexism is the reason for disparities in male dominated fields. Where is your conclusive evidence?

      I don’t make the assumption that sexism is the reason for disparities in male dominated fields. I lay out a game theory based model for how to test whether sexism plays a role in male dominated fields.

      2. you assume that there is such a thing as “perceived equal ability”, as in every resume given to every company always has an equal but gender opposite resume to pick from.

      I don’t believe I ever mentioned resumes anywhere. I laid out a game theory model that is mathematically based. It doesn’t matter if you can’t always measure things perfectly in the real world because it’s just a model of the real world.

      3. Your entire first paragraph is based on the assumption that adding more women makes everything better because they are women.

      My entire first paragraph was actually about how some people believe that women are less capable than men and that adding more women makes things worse. It’s pretty clear. Maybe you read it wrong? Try reading it again.

      4. On a scale of one to five, if you have all level five men and a level 4 woman and you only want the top 5 speaking at your conference, it makes no sense to include the level 4 woman by removing a level 5 man.

      In this situation, the final panel would be made up of all men and you would not be able to use the test I laid out to determine whether sexism played a role because there is no average ability of the women on the panel because there are no women on the panel. The model cannot be used for this situation because you can’t divide by 0.

      If you change it slightly to make it a 6 person panel, you would end up with five men on the panel with an average ability of 5 and one woman on the panel with an average ability of 4. Since the average ability of the women is less than the men, the model says that sexism was not a factor in the gender disparity of the panel.

      Out in the world, if we see situations where there is a large gender disparity and the objective measurement of ability of the women is lower than that of the men, then sexism of the sort where the bar to entry is higher for women is likely not a factor.

      5. lets get to your idea that women are more qualified, on average, than men when there is a large disparity in small group

      I do not believe that women are more qualified on average than men whenever there is a gender disparity. As I already mentioned, all I did was lay out a model for how to test whether a certain type of sexism was involved in the gender disparity. When there is a gender disparity and the women are on average more qualified than the men, it means that sexism involving a higher bar for entry for women played a part. When women are equal or lower on average than the men, the model says that the type of sexism I laid out was probably not a factor.

      6. There is a disparity in nurses at a hospital. Fifteen of them are women and 5 are men. On average, men are more qualified than women because there are fewer of them. This is probably true no matter what group you use and here’s why; average is the collective amount “qualification” divided by the total.

      What I laid out was a mathematical model that has perfect information, hence the small n used in the examples I provided. If you were going to apply the model out in the real world, you would never, ever apply it to an n of 20. None of the studies I mentioned that included real world data had small n’s.

      If you did try to apply it to an n of 15 you are sort of right and sort of wrong. The smaller the n, the more variation you would see, which could result in much higher or much lower averages. In this case, both 15 and 5 are so small that you would have huge variation in both. But, let’s say there were 100 nurses and 95 were women and 5 were men. We could probably get a pretty good average for the 95 women because the n is high. We would have trouble getting a good average for the men because there are only 5 of them and, as you mentioned, one outlier could have a big effect on the average. However, it would absolutely not mean that we would always find that the average for the men is higher than that of the women because there are fewer of them. It just means it’s more variable. We would be just as likely to see a very low average for the men as a very high average because one outlier could pull the average up or down by a lot.
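
If you want to see the small-n point for yourself, a quick simulation works. This is only a sketch with made-up assumptions: every nurse’s rating is drawn uniformly from 1 to 5 regardless of gender, and the names `group_mean` and `simulate` are hypothetical.

```python
import random
from statistics import pstdev


def group_mean(rng, n):
    """Average rating of n people, each rated uniformly from 1 to 5 (assumption)."""
    return sum(rng.randint(1, 5) for _ in range(n)) / n


def simulate(n_women=95, n_men=5, trials=2000, seed=0):
    """Repeat the hospital many times with identical ability distributions."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    women_means = [group_mean(rng, n_women) for _ in range(trials)]
    men_means = [group_mean(rng, n_men) for _ in range(trials)]
    # Share of hospitals where the small group happens to average higher.
    share_men_higher = sum(m > w for m, w in zip(men_means, women_means)) / trials
    return pstdev(women_means), pstdev(men_means), share_men_higher


spread_women, spread_men, share_men_higher = simulate()
print(spread_women, spread_men, share_men_higher)
```

The average of the 5 men bounces around far more than the average of the 95 women, but it lands above the women’s average only about half the time. A small group makes the average noisier, not systematically higher.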

      However, none of this is a problem for the model I laid out because it’s a model based in game theory and not an actual statistical study of real world events.

      7. Antivaxxers use this same way of thinking to show that vaccines are harmful. They say that more unvaccinated kids, on average, are healthier than vaccinated kids, which is true before you discover the fact that way more kids get vaccinated.

      Yes, anti-vaxxers do say that unvaccinated kids are healthier than vaccinated kids, but the reason this can sometimes be the case is not because there are fewer unvaccinated kids. As I mentioned in my previous point, having a lower n could result in much higher OR much lower averages. The more likely explanation is that families that do not vaccinate are on average more well-off financially and are more likely to do things like eat healthy and promote exercise. Also, families with sick children are far more likely to vaccinate because their children are in more danger if they get diseases. In other words, the problem with a vaccinated versus unvaccinated study is that correlation does not equal causation, not that the number of unvaccinated children is too small.

      In all, you seem a little bit confused about what game theory is and how it works. Although I’ve never read it myself, I’ve heard that this book is a great primer on game theory that is great for people without a deep mathematical background, so if you’re considering learning more you might want to read it. https://www.amazon.com/gp/product/0393310353/
