Hi everyone! Today I'm going to tell you about the famous Russian game "Chto? Gde? Kogda?" ("What? Where? When?"). We will try to answer the question: do the experts play fairly, or is there some degree of falsification? Below is a statistical analysis of the distribution of the experts' game results from 2002 to 2013. So that a skeptical reader can check my calculations, I have attached charts, tables and Python scripts. Here we go!

The data for the analysis was taken from the official website of the game, http://chgk.tvigra.ru/letopis/, and covers the years 2002 to 2013. Why start from 2002? The reason is purely technical: earlier years are harder to scrape, so I decided not to include them. To scrape the data from the website I wrote a script. The script and its results are at your service below.

chgk_parse.py

letopis.csv
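
For readers who just want the idea, here is a minimal sketch of what such a scraper might look like. The encoding and the score-matching pattern are my assumptions about the page, not its real structure, so treat this as an illustration of the approach rather than the actual chgk_parse.py:

```python
# A minimal scraper sketch: download the chronicle page and dump score pairs to CSV.
# The regex pattern and the output layout are assumptions, not the real chgk_parse.py.
import csv
import re
import urllib.request

BASE_URL = "http://chgk.tvigra.ru/letopis/"  # official chronicle of the game


def fetch(url):
    """Download a page and return its text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_scores(html):
    """Find everything that looks like a final score, e.g. '6:4' (hypothetical pattern)."""
    return re.findall(r"\b([0-6]):([0-6])\b", html)


if __name__ == "__main__":
    page = fetch(BASE_URL)
    with open("letopis.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["experts", "viewers"])
        for experts, viewers in extract_scores(page):
            writer.writerow([experts, viewers])
```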

A few words about the analysis. We assume the questions for the experts are chosen at random, which means there is an objective probability of answering any given question correctly. From 2002 to 2013 (according to the published results) there were 1799 questions, of which the experts answered 900 correctly. Thus the probability is about 50% (more precisely, 50.0278%). Assuming the questions are random, we can forecast the probabilities of all possible game results with the following formula:

$$ C^{\min(z, t)}_{z + t - 1} \, P^{z} (1 - P)^{t} $$

where P is the probability of the experts answering a single question correctly, z is the experts' score and t is the viewers' score. In other words, if a game ends 6:4, then z = 6 and t = 4. The binomial coefficient counts the ways to distribute the loser's points among the first z + t - 1 questions, since the final question is always won by whoever reaches 6.

For example, the probability that a game ends 6:5 (the experts win) is 12.3%, the probability that the viewers win 3:6 is 10.9%, and so on. Here is the full table:

Score (experts:viewers)   Probability
0:6                       0.016
1:6                       0.047
2:6                       0.082
3:6                       0.109
4:6                       0.123
5:6                       0.123
6:5                       0.123
6:4                       0.123
6:3                       0.110
6:2                       0.082
6:1                       0.047
6:0                       0.016
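
For a skeptical reader, here is a short script that reproduces the table above directly from the formula; the only input is P, estimated from the published results.

```python
# Reproduce the table of theoretical score probabilities from the formula
# C(z + t - 1, min(z, t)) * P**z * (1 - P)**t. Standard library only (Python 3.8+).
from math import comb

P = 900 / 1799  # estimated probability of a correct answer (about 50.03%)


def score_probability(z, t, p=P):
    """Probability that a game ends with the score experts z : viewers t."""
    return comb(z + t - 1, min(z, t)) * p**z * (1 - p)**t


if __name__ == "__main__":
    scores = [(z, 6) for z in range(6)] + [(6, t) for t in range(5, -1, -1)]
    for z, t in scores:
        print(f"{z}:{t}  {score_probability(z, t):.3f}")
    # The twelve probabilities cover all possible outcomes, so they sum to 1.
    print("sum:", round(sum(score_probability(z, t) for z, t in scores), 6))
```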

Of course, these probabilities come with a limited reliability. In other words, with a finite number of games the observed frequency of each score should fall within a certain range around the theoretical value, for example with 95% probability (about 2 sigma). Using one more script, let's calculate the observed frequencies of the game results and compare them with the forecasts of the theoretical model. Look at the chart:

chgk_analyse.py

The red columns show the observed frequency of each score, the yellow columns show the forecasts of the theory. The blue bars are the 95% ranges, i.e. the observed frequencies should fall inside these ranges with 95% probability. An observed frequency can fall outside its range, but only rarely (5% for each score). To make the picture clearer, I added the green columns: they show one possible outcome generated by the theoretical model itself.
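
Here is a rough sketch of how the 95% ranges and the simulated (green) outcome could be obtained. The number of games is a placeholder parameter (in the real analysis it would come from letopis.csv), and the normal 2-sigma approximation for the ranges is my own choice, not necessarily what chgk_analyse.py does:

```python
# Sketch: 2-sigma ranges for observed score frequencies and one simulated "season".
# n_games below is a placeholder; the real value comes from the scraped letopis.csv.
import random
from math import comb, sqrt

P = 900 / 1799
SCORES = [(z, 6) for z in range(6)] + [(6, t) for t in range(5, -1, -1)]


def score_probability(z, t, p=P):
    return comb(z + t - 1, min(z, t)) * p**z * (1 - p)**t


def confidence_range(p, n_games):
    """Approximate 95% (2 sigma) range for the observed frequency of one score."""
    sigma = sqrt(p * (1 - p) / n_games)
    return max(0.0, p - 2 * sigma), min(1.0, p + 2 * sigma)


def simulate_frequencies(n_games, rng=random):
    """Draw one possible set of observed frequencies under the theoretical model."""
    probs = [score_probability(z, t) for z, t in SCORES]
    counts = [0] * len(SCORES)
    for _ in range(n_games):
        counts[rng.choices(range(len(SCORES)), weights=probs)[0]] += 1
    return [c / n_games for c in counts]


if __name__ == "__main__":
    n_games = 200  # placeholder, NOT the real number of games
    simulated = simulate_frequencies(n_games)
    for (z, t), freq in zip(SCORES, simulated):
        low, high = confidence_range(score_probability(z, t), n_games)
        print(f"{z}:{t}  simulated={freq:.3f}  range=({low:.3f}, {high:.3f})")
```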

As we can see, all the green columns are inside the theoretical ranges. The probability of that is 0.95^12 = 54% (12 because there are 12 columns, and each stays inside its 95% range). Even if one column were outside its range, it would be nothing terrible: the probability of exactly one outlier is 12 * 0.05 * 0.95^11 = 34% (in other words, also a perfectly possible case). Thus the probability that two or more columns fall outside their ranges is 12%. Now let's look at the chart: the actual game results (red columns) have 3 columns outside the ranges, and the probability of 3 or more outliers is only 2%. In other words, there is a 2% chance of seeing such results if the games really follow the statistical model.
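
The arithmetic in this paragraph is easy to check: if each of the 12 columns independently falls outside its range with probability 0.05, the number of outliers follows a Binomial(12, 0.05) distribution.

```python
# Check of the outlier probabilities: the number of columns outside their 95% ranges,
# assuming independence, is Binomial(n=12, p=0.05).
from math import comb


def binom_pmf(k, n=12, p=0.05):
    return comb(n, k) * p**k * (1 - p)**(n - k)


print("no outliers:        ", round(binom_pmf(0), 3))                             # ~0.54
print("exactly 1 outlier:  ", round(binom_pmf(1), 3))                             # ~0.34
print("2 or more outliers: ", round(1 - binom_pmf(0) - binom_pmf(1), 3))          # ~0.12
print("3 or more outliers: ", round(1 - sum(binom_pmf(k) for k in range(3)), 3))  # ~0.02
```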

What conclusion can we draw? As I said, the probability of observing such game results is 2%. 2% against 98%: draw your own conclusions! Besides, we can notice that the viewers win by a crushing score (like 3:6) suspiciously rarely, while the experts win 6:5 suspiciously often; the experts' observed probability of answering correctly when the score is 5:5 is 70%, although it is about 50% at any other score. It seems the guys feel the responsibility and manage to rally at the crucial moments! Or maybe the reason is something else?