In this article, I tell about a known mathematical task called "choice problem". In some sources its name can be "secretary problem", "marriage problem", etc. Some general information about it you can find on wikipedia. Here I focus on the mathematical aspect mostly, suggesting my own view on it. Also in the end I list some tables with the solutions of the task that can be helpful in our everyday life.

Formulation

In a kingdom there was a princess who did not want to marry, she even completely ignored datings with any men during the years. One day the patience of the king had been over and he made a condition: tomorrow he invites 100 princes from different kingdoms who will come into her room one by one in random order introducing themselves and making marriage proposals. The princess can accept a proposal, so the other princes go away immediately, or she can reject, so this rejected prince goes away forever and the next one comes in. According to the condition, the princess cannot reject everybody: if the last one comes in she has to accept, no matter she likes him more or less than the previous ones. The question is what strategy should the princess follow to get a prince as good as possible?

No need to explain why this task is so popular to discuss. Such or similar choice problem takes place in many areas like choosing a husband or an employee. And in practice everybody resolves it as he finds correct.

Let us talk about details. Certainly, it is interesting to generalize the task for an arbitrary number of princes. Also intentionally I formulated unclearly about the criterion of "as good as possible". The point is that this criterion can be different providing different solutions. Classically, the criterion is to maximize the probability to stop at the best one among the all came princes.

1. The maximization of the probability to choose the best prince.

Also I find interesting to consider a different criterion, where we stop at one of the best ones. Supposing, we can sort the princes making a ranking: the best one has number 1, the next one has 2, etc. We can search for a strategy that minimizes the mean of the chosen prince's ranking number.

2. The minimization of the mean of the chosen prince's rank.

It is obvious, the first criterion will not give a big probability even in case of the optimal strategy. Indeed, choosing a prince according to the condition we have to ignore all the next ones where possibly there is a better one. Also if the best prince is among first ones, we can take him typical hoping to find a better one because of the bigger quantity of the next princes. So if we miss the best one, we will have to accept the last one. Thus likely we finally have the best one or a random one, saying, 54-th in the ranking: best or nothing (random one), in other words. In real life if we miss the best one, we can be satisfied by 2-nd or 3-rd in the ranking, but not 54-th. This is why I find the second criterion more applicable. Below I consider both criterions.

Criterion 1: The maximization of the probability to choose the best prince.

Considering a next prince, the princess shouldn't accept if he is not better than all she has considered before. Otherwise, she chooses not the best one for sure; it is better to hope for a better one among the next unseen ones than to fail with 100%. Thus the signal to think about accepting is the current prince is the best among the all she has already seen. The common sense tells: if the current one is better than previous ones (the signal happened), and the number of them is little like 1 or 2, she shouldn't stop at him because there are too many princes ahead where likely she will meet a better one. On the other hand, if the number is big enough like 98 or 99, she should accept, because it is too unlikely to find a better one among the next ones.

All these thoughts lead to there is a number of princes at the beginning that the princess should reject in any case. After them she should accept the one if he is better than the all she has seen. This is the optimal strategy. And now all we need to do is to find the number. According to the criterion, we will maximize the probability to stop at the best one varying the number. This way will give us the optimal number and the value of the probability.

Let n be the total number of princes and p be the number of princes that the princess should reject in any case at the beginning. And now we are considering a k-th prince ($ k > p $). The probability of that he is better than the all previous ones is:

$$ \frac{1}{k} $$

And the probability of that the princes from (p+1)-th to (k-1)-th have never been better than the all previous ones (so they could not be chosen according to the strategy) is:

$$ \left(1 - \frac{1}{p+1}\right) \left(1 - \frac{1}{p+2}\right) ... \left(1 - \frac{1}{k-1}\right) $$

Thus the probability of stopping at k-th prince is:

$$ P_{1}(k) = \left(1 - \frac{1}{p+1}\right) \left(1 - \frac{1}{p+2}\right) ... \left(1 - \frac{1}{k-1}\right) \frac{1}{k} = \frac{p}{p+1} \frac{p+1}{p+2} ... \frac{k-2}{k-1} \frac{1}{k} = \frac{p}{k(k-1)} $$

The chosen k-th prince is the best one if all the next ones are worse. The probability of it is:

$$ P_{2}(k) = \left(1 - \frac{1}{k+1}\right) \left(1 - \frac{1}{k+2}\right) ... \left(1 - \frac{1}{n}\right) = \frac{k}{k+1} \frac{k+1}{k+2} ... \frac{n-1}{n} = \frac{k}{n} $$

So the probability of choosing the best prince on k-th number is:

$$ P(k) = P_{1}(k) P_{2}(k) = \frac{p}{n(k-1)} $$

To get the total probability of choosing the best prince, we need to sum all the probabilities for k from (p+1) to n:

$$ P_{best} = \sum_{k=p+1}^{n} P(k) = \frac{p}{n} \left(\frac{1}{p} + \frac{1}{p+1} + ... + \frac{1}{n-1}\right) $$

Well. This is the probability we need to maximize. It depends on p as a parameter. The condition of optimality is:

$$ \begin{cases} P_{best}(p) \ge P_{best}(p + 1) \ P_{best}(p) > P_{best}(p - 1) \end{cases} $$

For the first part:

$$ P_{best}(p) \ge P_{best}(p + 1) $$ $$ \frac{p}{n} \left(\frac{1}{p} + \frac{1}{p+1} + ... + \frac{1}{n-1}\right) \ge \frac{p + 1}{n} \left(\frac{1}{p+1} + ... + \frac{1}{n-1}\right) $$ $$ \frac{1}{n} + \frac{p}{n} \left(\frac{1}{p+1} + ... + \frac{1}{n-1}\right) \ge \frac{p}{n} \left(\frac{1}{p+1} + ... + \frac{1}{n-1}\right) + \frac{1}{n} \left(\frac{1}{p+1} + ... + \frac{1}{n-1}\right) $$ $$ 1 \ge \frac{1}{p+1} + ... + \frac{1}{n-1} $$

Similarly for the second one:

$$ 1 < \frac{1}{p} + \frac{1}{p+1} + ... + \frac{1}{n-1} $$

Rewriting it to the form:

$$ \begin{cases} \frac{1}{n-1} + \frac{1}{n-2} + ... + \frac{1}{p+1} \le 1 \ \frac{1}{n-1} + \frac{1}{n-2} + ... + \frac{1}{p+1} + \frac{1}{p} > 1 \end{cases} $$

In other words, to find the optimal p, we should accumulate the sum $ \frac{1}{n-1} + \frac{1}{n-2} + ... $ until it exceeds 1. The denominator, on which it happens, is the optimal p.

It is interesting to consider the case of big n. There is an approximation:

$$ 1 + \frac{1}{2} + \frac{1}{3} + ... + \frac{1}{m} \approx ln(m) $$, for m >> 1

Thus:

$$ \frac{1}{n-1} + \frac{1}{n-2} + ... + \frac{1}{p+1} \approx ln(n) - ln(p) = ln\left(\frac{n}{p}\right) $$, for n >> 1, p >> 1

So the optimal condition is:

$$ ln\left(\frac{n}{p}\right) = 1 $$ $$ p = \frac{n}{e} $$

Where $ e \approx 2.718281828... $

And the value of the probability to choose the best prince is:

$$ P_{best}(p) = \frac{p}{n} \left(\frac{1}{p} + \frac{1}{p+1} + ... + \frac{1}{n-1}\right) \approx \frac{1}{e} $$

In case of big n, it is necessary to reject the part $ \frac{1}{e} $ of princes and, considering the others, we should accept the one who is better than the all previous ones. For n = 100 the part $ \frac{1}{e} $ is 37 princes who should be rejected. This strategy gives success with the probability $ \frac{1}{e} \approx 37\% $

Criterion 2: The minimization of the mean of the chosen prince's rank.

The princes can be ranked from 1 to n. In the end the princess accepts a prince with a rank r. The fewer r the better. Below we will consider a strategy that will lead to different r depending on the princes' order. So we need to consider the mean of r. The strategy is better if it provides less the mean of r.

Supposing, we are considering k-th prince in the order they come in. He can be inserted into the ranking of princes that are already considered before. And this prince has the rank r in this ranking. The strategy should answer the question: should we accept this prince having the numbers k and r? If we accept him, his general rank (its mean value) can be the same or worse because unseen princes can displace him. Denote the general (final) rank as:

$$ r_{chosen} $$

If we reject him and continue searching, our strategy will lead to a mean of rank that we denote as:

$$ r_{non-chosen} $$

Because we minimize the mean of rank within the strategy, the criterion to accept must be:

$$ r_{chosen} \le r_{non-chosen} $$

This is the main idea, but the details matter. Below we will get explicit formulas for the means that will help us to do the right choice at every step. I suggest to start our consideration from the end. Supposing, we have not chosen anybody until the last prince. Obviously, the mean of his rank is:

$$ r_{last} = \frac{1+2+...+n}{n} = \frac{n+1}{2} $$

Now let's suppose, we are considering the penultimate one who has r-th place in the current ranking. The last prince can be worse: then the penultimate one keeps r-th place, or better: then the current one will be moved to (r+1)-th place. The probabilities of these are $ \frac{n-r}{n} $ and $ \frac{r}{n} $. Thus the mean in case of choosing is:

$$ r_{chosen} = r \frac{n-r}{n} + (r+1) \frac{r}{n} = r \frac{n+1}{n} $$

If we don't choose, we will have to accept the last one with the mean:

$$ r_{non-chosen} = r_{last} = \frac{n+1}{2} $$

So we have the condition for (n-1)-th step:

$$ r_{chosen} \le r_{non-chosen} $$ $$ r \frac{n+1}{n} \le \frac{n+1}{2} $$ $$ r \le \frac{n}{2} $$

It corresponds to the common sense: if the penultimate one is better than average one, he must be chosen, because the last one is statistically average.

Now we go from particular case to generalization. Supposing, we are considering k-th prince who has r-th rank in the current ranking of the princess. We have (n-k) princes ahead. So the probability $ r_{chosen} $ is a function:

$$ r_{chosen} = r_{chosen}(k, r) $$

Obviously:

$$ r_{chosen}(n, r) = r $$

Because the last prince will not change his r-th rank, we don't have princes after him.

I have thought much time how to explain the calculation of the function $ r_{chosen}(k, r) $. I found many ways to do it, but finally I decided to explain it with mathematical induction. Actually, I don't like this approach to show the essence, but in this particular case I think it is the best and simplest choice.

The formula we want to prove is:

$$ r_{chosen}(k, r) = r \frac{n+1}{k+1} $$

1. Base case.

$$ r_{chosen}(n, r) = r \frac{n+1}{n+1} = r $$

It is true, we saw it above.

2. Inductive step.

$$ r_{chosen}(k, r) = r \frac{n+1}{k+1} => r_{chosen}(k-1, r) = r \frac{n+1}{k} $$

When we have (n-k+1) princes ahead we can reduce the task to (n-k) princes. Because after the next one the rank of the current one will stay r with the probability $ \frac{k-r}{k} $, or will be moved to (r+1) with the probability $ \frac{r}{k} $. Thus:

$$ r_{chosen}(k-1, r) = \frac{k-r}{k} r_{chosen}(k, r) + \frac{r}{k} r_{chosen}(k, r+1) = \frac{k-r}{k} r \frac{n+1}{k+1} + \frac{r}{k} (r+1) \frac{n+1}{k+1} = r \frac{n+1}{k} $$

That is what we needed to prove. Now we know the function $ r_{chosen}(k, r) $.

Now let's consider $ r_{non-chosen} $. It is a bit more complicated, because after non-choosing (rejecting) we can accept at any next step. For the penultimate one we have already calculated:

$$ r_{non-chosen}(n-1) = r_{last} = \frac{n+1}{2} $$

To clarify, let's consider (n-2)-th step. If we reject, at (n-1)-th step we can have any rank from 1 to (n-1) with identical probabilities. Depending on $ r_{chosen}(n-1, r) $ we accept or reject accepting the last one. So:

$$ r_{non-chosen}(n-2) = \frac{1}{n-1} \sum_{r=1}^{n-1} min(r_{chosen}(n-1, r), r_{non-chosen}(n-1)) $$

Following the same logic, we can write the recursive formula for k-th step:

$$ r_{non-chosen}(k) = \frac{1}{k+1} \sum_{r=1}^{k+1} min(r_{chosen}(k+1, r), r_{non-chosen}(k+1)) = \frac{1}{k+1} \sum_{r=1}^{k+1} min(r \frac{n+1}{k+2}, r_{non-chosen}(k+1)) $$

So any value of $ r_{non-chosen}(k) $ can be calculated recursively from the end.

Well. Now we can calculate the values of $ r_{chosen}(k, r) $ and $ r_{non-chosen}(k) $ for any k and r. It is enough to use the strategy.

It's interesting to take a look at the value $ r_{non-chosen}(0) $. It shows the mean of the best prince's rank before we start considering them. In other words, this is the target value we minimize.

In formulas everything looks complicated and too theoretical. Let's look at some examples.

Numerical examples

Let's consider the simplest non-trivial case. Supposing, we have 3 princes instead of 100. It's obvious, we shouldn't accept the first one, because we have too little information. Accepting him we get rank 2 in the end (averagely). We can't pass the 2-nd one in any case, otherwise this would mean accepting the last one that brings rank 2 in the end. Considering the 2-nd one, he can be better or worse than the first one (with the probabilities 50%/50%). If he is better and we accept him we get rank 1 with the probability 2/3 and rank 2 with the prob 1/3. If the second one is worse, we reject and accept the last one with rank 2. So this strategy gives us the mean rank:

$$ \frac{1}{2} (\frac{2}{3} 1 + \frac{1}{3} 2) + \frac{1}{2} 2 = \frac{5}{3} $$

That is better than 2.

On the above formulas' level this calculation would look like this:

1) Calculating $ r_{non-chosen} $:

$$ r_{non-chosen}(n-1) = r_{non-chosen}(2) = \frac{n+1}{2} = \frac{3+1}{2} = 2 $$ $$ r_{non-chosen}(1) = \frac{1}{k+1} \sum_{r=1}^{k+1} min(r \frac{n+1}{k+2}, r, n-k-1), r_{non-chosen}(k+1)) = \frac{1}{2} \sum_{r=1}^{2} min(r \frac{4}{3}, 2) = \frac{1}{2} (min(\frac{4}{3}, 2) + min(\frac{8}{3}, 2)) = \frac{1}{2} (\frac{4}{3} + 2) = \frac{5}{3} $$

2) Considering the first prince. Of course, he has r = 1. Calculating $ r_{chosen} $:

$$ r_{chosen}(k=1, r=1) = r \frac{n+1}{k+1} = 2 $$

Because of $ \frac{5}{3} < 2 $ we reject the first one.

3) Doing the same for the second prince, we understand that if he has r = 1, he must be accepted, if r = 2 not.

Now it is interesting to consider a few particular cases for different n. I have prepared the tables with the optimal solutions.

n = 5, r = 2.05

1       nobody
2 - 3   1
4       1 - 2
5       1 - 5
n = 10, r = 2.558

1 - 3   nobody
4 - 5   1
6 - 7   1 - 2
8       1 - 3
9       1 - 5
10      1 - 10
n = 20, r = 3.002

1 - 5   nobody
6 - 10  1
11 - 13 1 - 2
14 - 15 1 - 3
16      1 - 4
17      1 - 5
18      1 - 7
19      1 - 10
20      1 - 20
n = 30, r = 3.204

1 - 8   nobody
9 - 15  1
16 - 19 1 - 2
20 - 21 1 - 3
22 - 23 1 - 4
24      1 - 5
25      1 - 6
26      1 - 7
27      1 - 8
28      1 - 11
29      1 - 15
30      1 - 30
n = 40, r = 3.328

1 - 11  nobody
12 - 19 1
20 - 25 1 - 2
26 - 28 1 - 3
29 - 30 1 - 4
31 - 32 1 - 5
33      1 - 6
34      1 - 7
35      1 - 8
36      1 - 9
37      1 - 11
38      1 - 14
39      1 - 20
40      1 - 40
n = 50, r = 3.412

1 - 13  nobody
14 - 24 1
25 - 30 1 - 2
31 - 34 1 - 3
35 - 37 1 - 4
38 - 39 1 - 5
40 - 41 1 - 6
42      1 - 7
43      1 - 8
44      1 - 9
45      1 - 10
46      1 - 12
47      1 - 14
48      1 - 18
49      1 - 25
50      1 - 50
n = 100, r = 3.603

1 - 27  nobody
28 - 47 1
48 - 59 1 - 2
60 - 67 1 - 3
68 - 73 1 - 4
74 - 77 1 - 5
78 - 80 1 - 6
81 - 82 1 - 7
83 - 84 1 - 8
85 - 86 1 - 9
87      1 - 10
88 - 89 1 - 11
90      1 - 12
91      1 - 14
92      1 - 15
93      1 - 17
94      1 - 19
95      1 - 21
96      1 - 25
97      1 - 30
98      1 - 37
99      1 - 50
100     1 - 100

The case of our princess is n = 100. According to the solution, she must ignore first 27 princes. From 28-th to 47-th she must accept the one if he is better than the all she has already considered. From 48-th to 59-th she must accept if he is the best of 2-nd after the best one among those whom she has considered. Etc. This strategy provides the mean of rank 3.603 in the total ranking. I think it is a very good result to have 3-rd or 4-th man in the ranking among 100 ones having so hard conditions from the king.