Long before fake news was a thing, there were articles about the supposed inferiority of women to men in chess. In most other domains of life, such ideas would be considered reactionary and repulsive; yet, when writing about chess, they are somehow not only acceptable but even mainstream. A few days ago, we saw the latest installment in this unsavory series: an article on the Indian website Mint, titled “Why do women lose in chess?” and reprinted here on ChessBase. Like so many of its predecessors, this article asserts a gender gap in chess achievement, and then speculates about possible contributing factors, such as male gatekeepers, lack of role models, and biological differences. It quotes GM Humpy Koneru as saying that men are just better players, “You have to accept it.”
In good tradition, the article is incredibly sloppy in arguing that there is an achievement gap between women and men to begin with. The author notes that the top two female players, Hou Yifan and Koneru Humpy, are ranked only 86th and 283rd in the world, that no woman has been world champion, and that the gap between the best female and best male player is 205 Elo points. These arguments are all variations on a common theme: whatever metric of top players you use, women are clearly worse than men. There is a huge flaw in this argument: to fairly compare an underrepresented to an overrepresented group, you should never use the top individuals. That is a form of statistical malpractice that wouldn’t stand in an introductory college course.
The Mint article starts out promising. It points that only 16% of the players registered with the All India Chess Federation are female, and states, correctly, “Fewer participants at the entry level results in fewer chances for the top slots.” It then promptly abandons this key argument while giving extensive coverage to folk psychology about “killer instinct” and “emotional sensitivity”.
A thought experiment
Why is this a key argument? It’s really quite simple. Let’s say I have two groups, A and B. Group A has 10 people, group B has 2. Each of the 12 people gets randomly assigned a number between 1 and 100 (with replacement). Then I use the highest number in Group A as the score for Group A and the highest number in Group B as the score for Group B. On average, Group A will score 91.4 and Group B 67.2. The only difference between Groups A and B is the number of people. The larger group has more shots at a high score, so will on average get a higher score. The fair way to compare these unequally sized groups is by comparing their means (averages), not their top values. Of course, in this example, that would be 50 for both groups – no difference!
Indian women play as well as men on average
At this point, you might think that this is just a theoretical argument – surely, when looking at chess ratings, it cannot be that simple? So let’s have a closer look at chess ratings. I downloaded the October 6, 2020 FIDE Standard rating list, selected all players of the Indian federation, and removed all junior players (born 2000 or later), since their ratings are often not reliable. I was left with 19,064 players, of whom 17,899 (93.9%) were male and 1,165 (6.1%) were female. The best male player was a certain Viswanathan Anand at 2753, and the best female player was Humpy Koneru at 2586 – a gap of 167 points. GM Koneru, ranked 15th among all Indian players, is the only female in the top 20. On the surface, these facts superficially seem to point to a gender gap in achievement.
They don’t. With our thought experiment in mind, let’s look at the full rating distributions of male and female Indian players. They look like this (binned from 1000 to 2800 in bins of 50):
The huge discrepancy between the blue and orange lines reflects the participation gap. To compare the distributions more easily, we change the vertical axis from number of players to proportion of players (within each gender):
The line for female players is more jagged because there are fewer of them, but other than that, these two distributions don’t look radically different from each other. Indeed, the average ratings of men (1434) and women (1466) are comparable. And averages are the fairer metric for comparing men and women.
Is 167 points an unexpectedly large gap?
But this does not answer our questions. Is, for example, a gap of 167 points between the male and female top players unexpectedly large? To answer this question, we are now going to look at all ratings as a single pool, dropping the gender identifiers altogether. We then randomly draw 17,899 ratings from this pool. These form the “overrepresented” group, and the remaining 1,165 ratings form the “underrepresented” group. These numbers are exactly the numbers of male and female players in our data, but we have instead created completely arbitrary groups with these numbers of individuals. We record the top rating in both groups. We repeat this process 100,000 times. (For the aficionados: we are following the logic of permutation tests.)
Guess what? The difference between the top ratings in the Overrepresented and Underrepresented groups is a whopping 153 points on average (with a standard deviation of 93). Again, remember that these groups are completely identical to each other except in their number of individuals. The mere fact that the underrepresented group constitutes only 6.1% of the population causes a large difference in top ratings. In this light, the real gap of 167 points could easily be due to chance instead of due to a real difference between women and men, just like the gap in our thought experiment. It is that simple.
Other widely used metrics don’t show evidence for a gender gap either. For example, based on participation alone, one would expect only 1.2 female players in the top 20 overall. So Humpy Koneru being the only female in the Indian top 20 is completely in line with statistical expectations based on the participation gap.
We conclude that at least among non-junior FIDE-rated Indian players, there is no evidence that the “achievement gap” is anything but a participation gap. That is not to deny the first-person perspective of top female players, who might feel that they have reached a personal ceiling in their performance. But statistically, there is nothing to suggest that top female players are underperforming given the overall ratio of female to male players. In fact, taking into account the systemic injustices and biases that they had to overcome to get where they are, they are likely overperforming.
If you want to compare chess achievements between men and women, given their vastly unequal numbers, it is a very bad idea to focus on the top male and female players.
If you insist to focus on the top players, you will have to account for the participation gap using an analysis similar to the one presented here. Just a casual remark won’t do it!
“Even if, hypothetically, you were to find a gender difference based on average ratings (rather than top ratings), you cannot jump to the conclusion that that difference is due to innate or biological factors. The first place to look would be systemic disadvantages and stereotype threat experienced by female players.”
The statistical arguments presented in this article are elementary enough for an introductory statistics class in university. In case you want to repeat the analysis for different countries, you can check my Matlab code for the details. If you prefer to read a published paper, look no further than this excellent paper by Merim Bilalić, Kieran Smallbone, Peter McLeod, and Fernand Gobet (2009). (The free PDF can be found through Google Scholar.) Its title is the first question everyone should be asking: Why are (the best) women so good at chess? And everyone’s second question should be: How can we reduce the insane participation gap?
About the author
Wei Ji (also Whee Ky) Ma is a Dutch FM rated 2324 and a Professor of Neuroscience and Psychology at New York University. He previously explained for Chessbase how a “genius culture” in chess might contribute to excluding women.