The game of Go involves a ranking system to allow people at different levels of strength to play fair games. The system is widely employed in both in-person and online games, and some variation of it is used by every major Go server and organization.
The ranks go from 30k to 1k (kyu), and then from 1d to 9d (dan). 1d is 1 rank (or “stone”) from 1k, and 5k is 4 stones from 9k. There are some problems with this system (e.g., in reality the ratings are neither linear nor transitive), but ranks still provide a good way of estimating someone’s strength before you play them for the first time.
In this essay we are going to examine the KGS rating data from 16 March 2008 for confident ranks. One of the key skills in data analysis is an understanding that data is data, and so it is a useful exercise to examine arbitrary data sets to see what you can discover. With this in mind, let’s examine the data and see what falls out.
The values here are rounded interval data: the zero point is arbitrary, and while differences are meaningful (a 5k is 3 ranks away from a 2k), sums are not (a 3k + a 2k has no meaning). Four stones is an interesting distance, because the majority of games are played within a 4-stone handicap.
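To make the interval idea concrete, here is a minimal sketch of one way to put ranks on an integer scale. The function names and the choice of 1k = 0 are my own assumptions for illustration (the zero point is arbitrary, after all), not anything KGS itself uses.

```python
import re


def rank_to_number(rank: str) -> int:
    """Map a rank such as '5k' or '2d' onto an integer scale.

    Assumed convention (hypothetical): 1k = 0 and 1d = 1, so each step
    of 1 corresponds to one stone.
    """
    match = re.fullmatch(r"(\d+)([kd])", rank.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised rank: {rank!r}")
    level, kind = int(match.group(1)), match.group(2)
    if kind == "k":
        if not 1 <= level <= 30:
            raise ValueError(f"kyu ranks run from 30k to 1k, got {rank!r}")
        return 1 - level  # 1k -> 0, 2k -> -1, ..., 30k -> -29
    if not 1 <= level <= 9:
        raise ValueError(f"dan ranks run from 1d to 9d, got {rank!r}")
    return level  # 1d -> 1, ..., 9d -> 9


def stones_apart(a: str, b: str) -> int:
    """Differences between ranks are meaningful; sums are not."""
    return abs(rank_to_number(a) - rank_to_number(b))


print(stones_apart("1d", "1k"))
print(stones_apart("5k", "9k"))
print(stones_apart("5k", "2k"))
```

Only the differences carry meaning here. Shifting every number by a constant would change nothing about the analysis.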
Before we can leap in and perform our analysis, however, it pays to look at the histogram to see the distribution of the data. We should see a few different things:
What this means is that our distribution isn’t really normal, so we should be very careful about using the arithmetic mean, since the skew will bias our result. We could legitimately calculate it with this data, but it would be tempting to make errant conclusions, so we will stick to using quantiles.
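As a rough illustration of why skew matters, the sketch below compares the mean and the median on a small made-up sample. The numbers are hypothetical and are not the KGS data; they use an assumed integer scale where 1k = 0 and 1d = 1, with a long tail of weaker players.

```python
import statistics

# Hypothetical, hand-made sample (NOT the KGS data), on an assumed
# integer scale where 1k = 0, 1d = 1, 4k = -3, and so on.
sample = [-25, -18, -11, -7, -5, -3, -3, -2, -1, 0, 1]

mean = statistics.mean(sample)
median = statistics.median(sample)

# The few very weak players drag the mean down, while the median
# stays at the middle player of the sorted sample.
print(f"mean:   {mean:.2f}")
print(f"median: {median}")
```

In this toy sample the long tail pulls the mean several stones below the median, which is the kind of distortion that quantiles avoid.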
I’ve marked on here the 10%, 25%, 50%, 75%, and 90% quantiles representing how many people fall below each value. The values for these are:
- 10%: 12k
- 25%: 8k
- 50%: 4k (Median)
- 75%: 1k
- 90%: 2d
This feels very compressed: over 25% of the players are between 4k and 1k. It also means that only 10% of KGS players are weaker than 12k, and over 50% of the players are between 8k and 1k. Someone who is around 4k is within 4 stones of the majority of the server.
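Quantiles like the ones listed above could be computed along these lines. This is a sketch only: the sample is hypothetical rather than the KGS data, the helper names are mine, and I am assuming the simple nearest-rank definition of a quantile and an integer scale where 1k = 0 and 1d = 1.

```python
import math


def number_to_rank(value: int) -> str:
    """Turn the assumed integer scale (1k = 0, 1d = 1) back into a rank label."""
    return f"{value}d" if value >= 1 else f"{1 - value}k"


def nearest_rank_quantile(values, p):
    """Smallest value with at least a fraction p of the data at or below it."""
    if not values:
        raise ValueError("need at least one value")
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    ordered = sorted(values)
    index = max(math.ceil(p * len(ordered)) - 1, 0)
    return ordered[index]


def share_within(values, centre, stones=4):
    """Fraction of players whose rank is within `stones` of `centre`."""
    close = sum(1 for v in values if abs(v - centre) <= stones)
    return close / len(values)


# Hypothetical sample for illustration only (NOT the KGS data).
sample = [-14, -11, -9, -7, -7, -5, -4, -3, -3, -3,
          -2, -2, -1, -1, 0, 0, 1, 1, 2, 3]

for p in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(f"{p:.0%}: {number_to_rank(nearest_rank_quantile(sample, p))}")

median = nearest_rank_quantile(sample, 0.50)
print(f"within 4 stones of {number_to_rank(median)}: "
      f"{share_within(sample, median):.0%}")
```

The last line is the same kind of check as the claim about a 4k player: it counts how much of the sample sits within 4 stones of the median.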
There are a couple of possible reasons for this, and we’ll explore them further when we look at the data for the American Go Association.