Mathematical Recreations

Introductory Problem: Solutions
I've attacked this problem numerically and analytically. The numerical work is contained in a
spreadsheet in Microsoft Excel format. I've tried to be as transparent as
possible about the layout and formulas in the spreadsheet, but I've put together a brief overview
here.
Smarties® come in six different colors and there are fifteen candies in each roll.
 Assuming the colors are individually random with equal probability, what is the chance of getting a
roll of candy with at least one of each color? (See spreadsheet or
analytical solutions, but the numerical answer is about 64.4%)
 With the same assumption, what is the chance of getting a roll of candy with at least two of each color?
(See the spreadsheet solution, but the numerical answer is about 7.3%)
 With the same assumption, what is the chance of getting a roll of candy with k colors, 0 < k < 6?
(See spreadsheet or analytical solutions)
 With the same assumption, what is the chance of getting a roll of candy with six consecutive candies,
each a different color? (I don't really have a satisfactory analytical solution, but you can see the
spreadsheet solution. The numerical answer is about 12.3%)
 With the same assumption, given that you have a roll with six different colors, what is the probability that
the roll has six consecutive candies with different colors? (This is a trivial combination of the answers to the
first question and the previous question. It is about 12.3%/64.4%, or about 19.1%)
 With the same assumption regarding randomness, given n possible colors for each of m locations,
what is the probability of having k different colors in the set? (See the analytical
solution below)
 With the same assumption, what is the probability of having at least one set of n consecutive locations
where each is a different color? I suspect that this may not be solvable in any closed form. In any case, I don't
have a closed-form solution yet. A general solution with linear computational complexity is
given below, and a method for generating analytical solutions for specific values of n
is also given below.
 In an infinite string of ordered locations, each of which can be one of n different colors, what is the
rate at which sequences of n different colors occur? The intent is that a location can only be part of one
group of n. Specifically, the answer for n = 2 is not 1/2; it is 1/3 since a sequence
like AABAA would count once, not twice. (I have a closed-form solution for this one
below. There are also two similar questions that I came up
with. The first one is "What is the probability that a particular sequence of n locations
will all have different colors?" The answer to this one is n!/n^{n}. The other
similar question is: "What is the probability that a particular location will be part of
at least one sequence of n different colors?" I haven't attempted to solve this one yet, but I think it's
doable, probably in closed form. Note that it's a special case of the previous question where
m = 2n – 1. Perhaps someone will tackle it.)
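The numerical answers quoted above are easy to spot-check by simulation. The sketch below is my own addition (the function name, trial count, and seed are arbitrary); it draws random 15-candy rolls in 6 colors and estimates the first, second, and consecutive-run probabilities:

```python
import random

def simulate(trials=100_000, n=6, m=15, seed=1):
    """Estimate, by Monte Carlo, the probability that a roll of m candies
    in n colors has (a) every color at least once, (b) every color at
    least twice, (c) n consecutive candies of n different colors."""
    rng = random.Random(seed)
    all_once = all_twice = consecutive = 0
    for _ in range(trials):
        roll = [rng.randrange(n) for _ in range(m)]
        fewest = min(roll.count(c) for c in range(n))
        all_once += fewest >= 1
        all_twice += fewest >= 2
        # any window of n consecutive candies with n distinct colors?
        consecutive += any(len(set(roll[i:i + n])) == n
                           for i in range(m - n + 1))
    return all_once / trials, all_twice / trials, consecutive / trials

p1, p2, p4 = simulate()
# expect roughly 0.644, 0.073, and 0.123 respectively
```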
Please send responses to
Thanks,
Steve

Here we consider the general case of m items, each of which can be any of n different
colors. I'm going to go through the process of "discovering" the solution, then deriving it.
This mirrors the way I actually solved the problem.
First, let's consider the probability that all m items are all the same color. That is, the
case where k = 1. There are n possible colors. The probability that any given
item is a specific color is equal to 1/n, so we find:
(1.1)  P(k = 1) = n (1/n)^{m} = n x_{1},

where x_{1} is the probability that all of the items are a specific color.
If we now consider the case where k = 2, we observe that there are n(n – 1)/2 different
ways to choose two colors. The probability that any given item is one of two specific colors is 2/n, but
we don't want to count the cases where all of the items are the same color. For each choice of two colors, there
are two possible ways that the items could be one color. So we find:
(1.2)  P(k = 2) = [n(n – 1)/2] [(2/n)^{m} – 2(1/n)^{m}] = [n(n – 1)/2] x_{2}.

Similarly, for k = 3, we note that, for each choice of three colors, there are three ways to
choose two of those colors and three ways to choose one of those colors, so:
(1.3)  P(k = 3) = [n(n – 1)(n – 2)/6] [(3/n)^{m} – 3x_{2} – 3x_{1}]
 = [n(n – 1)(n – 2)/6] [(3/n)^{m} – 3(2/n)^{m} + 3(1/n)^{m}] = [n(n – 1)(n – 2)/6] x_{3}.


At this point, we can start to notice some patterns. The coefficient for the number of ways to choose
k colors out of a possible n colors is, of course, the normal binomial coefficient
n!/k!(n – k)!. Furthermore, the coefficients of
x_{j} are also the binomial coefficients k!/j!(k – j)!.
So we can write the general form:
(1.4)  P(k) = [n!/k!(n – k)!] {(k/n)^{m} – ∑_{j = 1}^{k – 1} [k!/j!(k – j)!] x_{j}} = [n!/k!(n – k)!] x_{k}
Still, this is not in closed form. Let's do one more specific case for k = 4:
(1.5)  P(k = 4) = [n!/4!(n – 4)!] [(4/n)^{m} – 4(3/n)^{m} + 6(2/n)^{m} – 4(1/n)^{m}] = [n!/4!(n – 4)!] x_{4}.


Now we can notice that the coefficients of (j/n)^{m} for each P(k)
seem to be binomial coefficients as well, but
with alternating sign. The coefficient of (j/n)^{m} seems to be
(–)^{(k – j)}[k!/j!(k – j)!].
(Here I've used the notation (–)^{(k – j)} = 1 if
k – j is even and = –1 if k – j is odd.)
Let's see if we can derive the observed result. (I actually went up to k = 6 before I decided
to tackle the derivation.) We've shown that
(1.6)  x_{k} = ∑_{j = 1}^{k} (–)^{(k – j)} [k!/j!(k – j)!] (j/n)^{m}
for k < 5, but we can also substitute Eq. 1.6 into Eq. 1.4 to get a result for
general k. (We expect the result to be Eq. 1.6 again, in which case it will hold for all k. In
case it's new to anyone, this technique is called the principle of mathematical induction.) We find
(1.7)  x_{k} = (k/n)^{m} – ∑_{j = 1}^{k – 1} [k!/j!(k – j)!] ∑_{i = 1}^{j} (–)^{(j – i)} [j!/i!(j – i)!] (i/n)^{m}
We want to find the coefficient for each (i/n)^{m}, so we note that the
order of the sum can be switched. For i = 1, there is a term for each j greater
than or equal to one. Similarly, for arbitrary i, there is a term for each j greater than
or equal to i. We rewrite the equation as
(1.8)  x_{k} = (k/n)^{m} – ∑_{i = 1}^{k – 1} ∑_{j = i}^{k – 1} (–)^{(j – i)} [k!/j!(k – j)!] [j!/i!(j – i)!] (i/n)^{m}
Looking at Eq. 1.6, we see that we want to pull out a factor of
k!/i!(k – i)!, and we can also cancel out the like terms
of j! to get
(1.9)  x_{k} = (k/n)^{m} – ∑_{i = 1}^{k – 1} [k!/i!(k – i)!] (i/n)^{m} ∑_{j = i}^{k – 1} (–)^{(j – i)} (k – i)!/[(k – j)!(j – i)!]
The sum from j = i to k – 1 isn't going to simplify,
so let's substitute x = k – j to get a sum from one to
k – i.
(1.10)  x_{k} = (k/n)^{m} – ∑_{i = 1}^{k – 1} [k!/i!(k – i)!] (i/n)^{m} ∑_{x = 1}^{k – i} (–)^{(k – i – x)} (k – i)!/[x!(k – i – x)!]
Now we can see the light of day. We can be pretty sure that the inner sum simplifies to a sign,
and it simply remains to understand why. Now that we've switched
the variables, we can notice that the inner sum is over the binomial coefficients of
k – i with alternating sign. We know that the sum over all such coefficients
must be equal to zero. (If you don't already know this, look at
(y – 1)^{(k–i)} for y = 1.) The inner sum is
over all of the binomial coefficients of k – i except for x = 0.
The inner sum is therefore simply equal to –(–)^{(k–i)}, and we get
(1.11)  x_{k} = (k/n)^{m} + ∑_{i = 1}^{k – 1} (–)^{(k – i)} [k!/i!(k – i)!] (i/n)^{m},
which simplifies trivially to Eq. 1.6. We find, therefore, that
(1.12)  P(k) = [n!/k!(n – k)!] ∑_{j = 1}^{k} (–)^{(k – j)} [k!/j!(k – j)!] (j/n)^{m}
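Eq. 1.12 translates directly into a few lines of code. This sketch is my own (the function name is arbitrary); it evaluates P(k) and reproduces the 64.4% figure quoted above for a full roll of Smarties:

```python
from math import comb

def p_exactly_k(n, m, k):
    """Eq. 1.12: probability that m items, each independently and uniformly
    one of n colors, show exactly k distinct colors."""
    inner = sum((-1) ** (k - j) * comb(k, j) * (j / n) ** m
                for j in range(1, k + 1))
    return comb(n, k) * inner

# six colors, fifteen candies: chance of seeing all six colors
print(round(p_exactly_k(6, 15, 6), 4))  # 0.6442
```

As a consistency check, the values P(1) through P(6) for n = 6 and m = 15 sum to one.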
The method used in AF5:BA21 of the spreadsheet for n = 6
and m = 15 can be applied to arbitrary n and m. (The computation in the
spreadsheet tracks more information than is needed here, since it is keeping track of the number of
colors which have ever occurred, in addition to the length of the current sequence.)
We can start by defining some notation. Let p_{m} be the probability that a sequence
of n differently colored locations will be found within a series of m ordered locations.
If we keep track of state as we step through the locations, then at any given time we will have a current
sequence of i locations with another j locations remaining to be examined. We define
p_{i, j} as the probability of finding a sequence of n differently
colored locations out of a series of i + j ordered locations, given that the first
i locations are known to be differently colored. We are looking for
p_{m} = p_{0, m}.
We can start by noting that
(2.1)  p_{0, m} = p_{1, m – 1};  p_{i, 0} = 0 for i < n;  p_{n, j} = 1 for all j.

Let's start with the case where we have only one location left. We can see that
(2.2)  p_{i, 1} = 0 for i < n – 1;  p_{n – 1, 1} = 1/n.

For all i < n and all j > 0,
(2.3)  p_{i, j} = p_{i + 1, j – 1} (n – i)/n + ∑_{k = 1}^{i} p_{k, j – 1}/n.
This can be used to calculate all p_{i, j} for successively larger values
of j for any value of n. The computational complexity is linear in n and in j.
After examining j different locations, you have either found a sequence of n different
locations, or you have a current state with a sequence of i different locations. Think of the
state after examining j locations as a vector, where element i is the probability that
you have a current sequence of i different locations.
The initial vector after examining one location is (1, 0, 0, ...). The sum of the elements is
one minus the probability that a sequence of n consecutive differently colored locations has been
found.
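The vector picture just described is straightforward to code. Here is my own sketch of the linear-complexity computation (names are mine, not the spreadsheet's); for n = 6 and m = 15 it returns about 0.123, the answer quoted in the problem list:

```python
def p_consecutive(n, m):
    """Probability that m ordered locations, each uniformly one of n colors,
    contain at least one run of n consecutive, differently colored locations."""
    v = [0.0] * n      # v[i] = P(current run of distinct colors has length i)
    v[1] = 1.0         # after one location, the run always has length 1
    found = 0.0
    for _ in range(m - 1):
        w = [0.0] * n
        for i in range(1, n):
            # a new color extends the run, with probability (n - i)/n
            ext = v[i] * (n - i) / n
            if i + 1 == n:
                found += ext        # a run of n is completed
            else:
                w[i + 1] += ext
            # repeating the color k places back truncates the run to length k
            for k in range(1, i + 1):
                w[k] += v[i] / n
        v = w
    return found
```

For n = 2 this reproduces p = 1 – 1/2^{(m – 1)}, and for m = n it reproduces n!/n^{n}.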
There is an operator which transforms the vector each time a new location is examined. For any
specific value of n, we can find eigenvectors x such that x transforms into
c x, where c is a constant (called the eigenvalue) which is characteristic of that
eigenvector.
By finding the n – 1 different eigenvectors and decomposing
(1, 0, 0, ...) into a linear combination of those eigenvectors, we can create a general
solution for any specific n.
Let's look at the trivial case where n = 2. Each vector has one element. After
examining one location, you will always have a current sequence of one differently colored location,
corresponding to a vector of (1). After examining a second location, you will complete a sequence of
two different colors half the time. The other half of the time, you choose the same color again, and
you have a sequence of one differently colored location, corresponding to a vector of (1/2). After
examining three locations, the vector is (1/4), and so on. The eigenvalue is 1/2, and the
probability that you've found a sequence of n = 2 different locations after examining
m locations is given by
(3.1)  p = 1 – 1/2^{(m – 1)}.
For n = 3, the situation is no longer trivial. Let's write the vector as
(x_{1}, x_{2}). When you look at another location,
(x_{1}, x_{2}) transforms into
(3.2)  (x_{1} + x_{2}, 2x_{1} + x_{2}) / 3.

Our eigenvectors will, therefore, satisfy
(3.3)  (x_{1} + x_{2}) / 3 = c x_{1};  (2x_{1} + x_{2}) / 3 = c x_{2}

It will be convenient to define g = 3c – 1 (or more generally,
g = nc – 1) while we solve the
linear equations. We find
(3.4)  x_{2} = g x_{1};  x_{1} = g x_{2} / 2 = g^{2} x_{1} / 2.


This can immediately be solved to find that g = ±√2. For the positive
square root, we find an eigenvector proportional to (1, √2) with an eigenvalue of
(1 + √2)/3. For the negative square root, we
find an eigenvector proportional to (1, –√2) with an eigenvalue of
(1 – √2)/3.
The vector x after examining one location can be written as
(3.5)  x = (1, 0) = (1, √2) / 2 + (1, –√2) / 2.

We know how each eigenvector evolves, so we can immediately write the state after examining j
locations:
(3.6)  x_{j} = (1/2) (1, √2) [(1 + √2)/3]^{(j – 1)} + (1/2) (1, –√2) [(1 – √2)/3]^{(j – 1)}
From here, we can easily find the probability that a sequence of n = 3
differently colored locations has been found after examining m locations:
(3.7)  p = 1 – (3/2) [(1 + √2)/3]^{m} – (3/2) [(1 – √2)/3]^{m}

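As a quick numerical check of Eq. 3.7 (my own addition, not part of the original solution): at m = 2 it gives zero, as it must, and at m = 3 it gives 3!/3^{3} = 6/27, the probability that three random locations are all different.

```python
from math import sqrt

def p3(m):
    # Eq. 3.7: closed form for n = 3
    a = (1 + sqrt(2)) / 3
    b = (1 - sqrt(2)) / 3
    return 1 - 1.5 * a**m - 1.5 * b**m

assert abs(p3(2)) < 1e-12           # no run of 3 is possible in 2 locations
assert abs(p3(3) - 6 / 27) < 1e-12  # matches n!/n^n for m = n
```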
At this point, you may want to skip to the next section. The basic techniques for doing n > 3
are pretty much the same, but the math gets more complex. That is, the eigenvectors and eigenvalues may
be complex numbers. To demonstrate how the system evolves with complex solutions, I'm providing a solution
for n = 4. Feel free to skip it unless you are truly intrigued.
Our simultaneous equations for n = 4 are
(3.8)  x_{2} + x_{3} = g x_{1};  3x_{1} + x_{3} = g x_{2};  2x_{2} = g x_{3}

From these, we find
(3.9)  x_{2} = g x_{3} / 2;  x_{1} = (g^{2} – 2) x_{3} / 6;  g^{3} = 5g + 6.

Solving the cubic equation is tedious, but not hard. Obviously, the real root can be found numerically,
but the analytical formula also works:
(3.10)  g_{0} = (3 + √(118/27))^{1/3} + (3 – √(118/27))^{1/3} = 2.689095324

This leaves the residual quadratic
(3.11)  g^{2} + g_{0} g + 6/g_{0} = 0.

This quadratic has complex roots
(3.12)  g_{±} = –g_{0}/2 ± i √(3g_{0}^{2} – 20)/2 = –g_{0}/2 ± i g_{i}/2,


where this equation defines
g_{i} = √(3g_{0}^{2} – 20). Note that it is not
necessary to have any powers of g_{0} greater than g_{0}^{2} or any powers of
g_{i} at all, since these can be eliminated using the following identities:
(3.13)  g_{0}^{3} = 5g_{0} + 6;  g_{i}^{2} = 3g_{0}^{2} – 20.

We find the following three eigenvectors for n = 4
(3.14)  x_{0} = (g_{0}^{2} – 2, 3g_{0}, 6)
 x_{+} = (6 – g_{0}^{2} – i g_{0}g_{i}, –3g_{0} + 3i g_{i}, 12)
 x_{–} = (6 – g_{0}^{2} + i g_{0}g_{i}, –3g_{0} – 3i g_{i}, 12)

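Because the algebra is getting heavy, it is worth verifying Eq. 3.14 numerically. The sketch below is my own check (not part of the original solution): it applies the n = 4 run-length operator to each vector and confirms that it is an eigenvector with eigenvalue c = (g + 1)/4.

```python
import math

# real root of g^3 = 5g + 6 (Eq. 3.10) and the constant g_i (Eq. 3.12)
g0 = (3 + math.sqrt(118 / 27)) ** (1 / 3) + (3 - math.sqrt(118 / 27)) ** (1 / 3)
gi = math.sqrt(3 * g0**2 - 20)
g_plus = complex(-g0 / 2, gi / 2)
g_minus = complex(-g0 / 2, -gi / 2)

eigenpairs = [
    (g0,      (g0**2 - 2, 3 * g0, 6)),
    (g_plus,  (6 - g0**2 - 1j * g0 * gi, -3 * g0 + 3j * gi, 12)),
    (g_minus, (6 - g0**2 + 1j * g0 * gi, -3 * g0 - 3j * gi, 12)),
]
for g, (x1, x2, x3) in eigenpairs:
    c = (g + 1) / 4
    # the n = 4 operator: how (x1, x2, x3) transforms on examining a location
    Mx = ((x1 + x2 + x3) / 4, (3 * x1 + x2 + x3) / 4, (2 * x2 + x3) / 4)
    assert all(abs(lhs - c * rhs) < 1e-9 for lhs, rhs in zip(Mx, (x1, x2, x3)))
```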

With a little bit of work, we can express (1, 0, 0) in terms of the three eigenvectors:
(3.15)  (1, 0, 0) = [1/(3g_{0}^{2} – 5)] x_{0}
 + [(–g_{i} + 3i g_{0})/(4g_{i}(3g_{0}^{2} – 5))] x_{+}
 + [(–g_{i} – 3i g_{0})/(4g_{i}(3g_{0}^{2} – 5))] x_{–}

We know how each eigenvector evolves, so we can use this information to write the general form for the
state after examining j locations
(3.16)  x_{j} = [1/(3g_{0}^{2} – 5)] x_{0} [(g_{0} + 1)/4]^{(j – 1)}
 + [(–g_{i} + 3i g_{0})/(4g_{i}(3g_{0}^{2} – 5))] x_{+} [(2 – g_{0} + i g_{i})/8]^{(j – 1)}
 + [(–g_{i} – 3i g_{0})/(4g_{i}(3g_{0}^{2} – 5))] x_{–} [(2 – g_{0} – i g_{i})/8]^{(j – 1)}


The elements of x are always real numbers. (Note
that the last two terms are complex conjugates of each other; their real parts are equal, and their
imaginary parts are equal in magnitude but opposite in sign.) To get the answer to our question, we need to
sum the elements of x to see how the sum evolves over time.
Multiplying out and using the identities to simplify, we get
(3.17)  x_{1} + x_{2} + x_{3} = [(g_{0}^{2} + 3g_{0} + 4)/(3g_{0}^{2} – 5)] [(g_{0} + 1)/4]^{(j – 1)}
 + {[g_{i}(2g_{0}^{2} – 3g_{0} – 9) + i(–9g_{0}^{2} + 17g_{0} + 30)]/[2g_{i}(3g_{0}^{2} – 5)]} [(2 – g_{0} + i g_{i})/8]^{(j – 1)}
 + {[g_{i}(2g_{0}^{2} – 3g_{0} – 9) – i(–9g_{0}^{2} + 17g_{0} + 30)]/[2g_{i}(3g_{0}^{2} – 5)]} [(2 – g_{0} – i g_{i})/8]^{(j – 1)}


Now, finally, we can eliminate the complex notation with the identity
e^{i t} = cos t + i sin t.
Noting that the answer we're looking for is
1 – (x_{1} + x_{2} + x_{3}), we can write
the probability that a sequence of m ordered locations has at least one series of n = 4 differently
colored locations:
(3.18)  p = 1 – [(g_{0}^{2} + 3g_{0} + 4)/(3g_{0}^{2} – 5)] [(g_{0} + 1)/4]^{(m – 1)}
 – {[(2g_{0}^{2} – 3g_{0} – 9)/(3g_{0}^{2} – 5)] cos[(m – 1)(π – Tan^{–1}(g_{i}/(g_{0} – 2)))]
 + [(9g_{0}^{2} – 17g_{0} – 30)/(g_{i}(3g_{0}^{2} – 5))] sin[(m – 1)(π – Tan^{–1}(g_{i}/(g_{0} – 2)))]}
 × [(g_{0}^{2} – g_{0} – 4)/16]^{(m – 1)/2}

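Eq. 3.18 can be checked against small cases that are easy to count directly (this cross-check is mine, not part of the original solution): for m = 1 it gives 0, for m = 4 it gives 4!/4^{4} = 24/256, and for m = 5 it gives 42/256 (by inclusion-exclusion, 96 + 96 – 24 = 168 of the 1024 length-5 strings contain a window of four different colors).

```python
import math

# constants from Eqs. 3.10 and 3.12
g0 = (3 + math.sqrt(118 / 27)) ** (1 / 3) + (3 - math.sqrt(118 / 27)) ** (1 / 3)
gi = math.sqrt(3 * g0**2 - 20)

def p4(m):
    # Eq. 3.18: closed form for n = 4
    D = 3 * g0**2 - 5
    theta = math.pi - math.atan(gi / (g0 - 2))
    decay = (g0**2 + 3 * g0 + 4) / D * ((g0 + 1) / 4) ** (m - 1)
    osc = ((2 * g0**2 - 3 * g0 - 9) / D * math.cos((m - 1) * theta)
           + (9 * g0**2 - 17 * g0 - 30) / (gi * D) * math.sin((m - 1) * theta))
    return 1 - decay - osc * ((g0**2 - g0 - 4) / 16) ** ((m - 1) / 2)

assert abs(p4(1)) < 1e-9
assert abs(p4(4) - 24 / 256) < 1e-9
assert abs(p4(5) - 42 / 256) < 1e-9
```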

Here we consider an infinite series of locations, where each location can be one of n
different colors. The question at hand is a variation on "How often do sequences of n different colors
occur?" Taken at face value, the answer is simply the probability that n consecutive
locations will be a different color. That is, n!/n^{n}. (There is only one set of
n colors, but there are n! different ways to order that set. Each ordering has a
1/n^{n} chance of occurring.)
This answer counts overlapping strings twice, which is fine if that's what you mean.
For example, if there are six colors, the string ROYGVWOR would count twice: once for ROYGVW and
once for YGVWOR. My intent is to count the number of non-overlapping sequences of n
different colors. (There is another variation of the question that I haven't solved: "What is the
probability that a given location will be part of one or more sequences of n different colors?")
Let's look first at the simple case of n = 2 colors. Assume that a sequence of
two different colors has just occurred and we are looking for the next occurrence. When you examine
the next location, you will have a run of one location.
When you examine the second location, you will have a 50/50 chance of completing a sequence of two
different colors. If
you don't complete the sequence, then you are back to a run of one location.
Let's let x_{1} be the expectation value of the number of additional locations that
you will have to examine, if you currently have one consecutive location which is different. Then we
can write
(4.1)  x_{1} = 1 + x_{1}/2,
since you must examine one more location plus a 50/50 chance of examining zero additional locations or
x_{1} additional locations. In this case, we find x_{1} = 2.
We'll let i be the length of the current sequence, and x_{i} will be the expectation
value of the number of additional locations which need to be examined before reaching a state of
i = n. The answer we are looking for is x_{0}. (Actually, we are
looking for 1/x_{0}, since the problem asked for the rate.) For any value of
n, we can observe that
(4.2)  x_{0} = x_{1} + 1,
since examining the first location always produces a sequence of one.
For the case of n = 2, we find x_{0} = 3. Now let's start
exploring the system for other values of n. For any specific value of n, we can create a
set of linear equations:
(4.3)  x_{i} = 1 + x_{i + 1} (n – i)/n + ∑_{j = 1}^{i} x_{j}/n

We note that x_{n} is equal to zero. That is, when you get a sequence of n
different locations, you terminate the search. I wrote out these equations manually (and solved
them) up through n = 4. In doing that, I observed that there is a fairly
small change from x_{i – 1} to x_{i}. We can write
(4.4)  x_{i – 1} = x_{i} – x_{i + 1} (n – i)/n + x_{i} (n – i)/n
 = x_{i} (2n – i)/n – x_{i + 1} (n – i)/n,


since all but one term of the summation is common to both equations. This recursion formula can
be used to generate all of the x_{i} once we know two consecutive values. Since we
know one value for i = n, we can also determine all of the
x_{i} in terms of x_{n – 1}. Let's do the first few:
(4.5)  x_{n – 2} = x_{n – 1} (n + 1)/n
 x_{n – 3} = x_{n – 1} (n^{2} + n + 2)/n^{2}
 x_{n – 4} = x_{n – 1} (n^{3} + n^{2} + 2n + 6)/n^{3}
 x_{n – 5} = x_{n – 1} (n^{4} + n^{3} + 2n^{2} + 6n + 24)/n^{4}

I actually had to do this many to notice that there was a pattern, because I wasn't expecting to
see one. (I was hoping that n^{2} + n + 2 would turn out to be
something factorable.) We can now note that x_{i} and
x_{i + 1} bear some resemblance to each other. Let's assume that
x_{i – 1} has the following form:
(4.6)  x_{i – 1} = x_{i} + c_{i} x_{n – 1}/n^{n – i}

At this point, we can see that this form holds true for all i from n – 4 through
n – 1. We can now rewrite the recursion formula above in terms of differences between
neighboring x_{i}
(4.7)  x_{i – 1} – x_{i} = (x_{i} – x_{i + 1}) (n – i)/n
 c_{i} x_{n – 1}/n^{n – i} = (c_{i + 1} x_{n – 1}/n^{n – i – 1}) (n – i)/n

This immediately simplifies to
(4.8)  c_{i} = (n – i) c_{i + 1}.

Now (if you hadn't already noticed), we can see that the coefficients are factorials, and
c_{i} = (n – i)!. Let's look specifically at
x_{0} – x_{1}:
(4.9)  x_{0} – x_{1} = c_{1} x_{n – 1}/n^{n – 1}

Since we know that x_{0} – x_{1} is equal to one, and that
c_{1} is equal to (n – 1)!, we find:
(4.10)  x_{n – 1} = n^{n – 1}/(n – 1)! = n^{n}/n!

and we can write the general form for x_{i}
(4.11)  x_{i} = (n^{n}/n!) ∑_{j = 0}^{n – i – 1} j!/n^{j}

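The closed form just derived is simple to evaluate (this sketch is mine; the function name is arbitrary). For n = 2 it gives x_0 = 3, matching the 1/3 rate claimed in the problem statement, and for n = 6 it gives x_0 = 83.2 exactly, so a non-overlapping run of six different colors occurs about once every 83 candies:

```python
from math import factorial

def expected_wait(n, i):
    """Expected number of further locations to examine before completing a
    run of n different colors, given a current run of length i."""
    return (n**n / factorial(n)) * sum(factorial(j) / n**j
                                       for j in range(n - i))

print(expected_wait(2, 0))            # 3.0
print(round(expected_wait(6, 0), 6))  # 83.2
print(round(expected_wait(6, 5), 6))  # 64.8 = 6^6/6!
```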
As a side note, x_{n – 1} = n^{n}/n!,
which is the rate at which (potentially overlapping) sequences of n different locations occur. (It is
also one over the probability that n random locations will all be different colors.) Why?
Well, if you allow overlapping sequences, then you start looking for the next sequence when you already
have n different in a row. So when you consider the next possible sequence, it's composed of the
last n – 1 locations (which are all different), plus the next location. So the mean
number of sequences that you have to examine to find the next one with all n locations colored
differently will be x_{n – 1}. This is necessarily one over the
probability that a random sequence of n locations will all be colored differently, since the rate is
one over the mean interval.
I could have used this argument to simply state that
x_{n – 1} = n^{n}/n!, but I felt it was
better to derive it and then explain it. After all, I didn't have that kind of clarity of thought until
after I found the answer, so I felt it was fair to show how I actually got there.
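One last check (my own, not part of the original writeup): a seeded simulation of the greedy non-overlapping count reproduces 1/x_0, giving a rate of 1/3 for n = 2 and 1/7 for n = 3, since the closed form gives x_0 = (27/6)(1 + 1/3 + 2/9) = 7.

```python
import random

def nonoverlap_rate(n, m=300_000, seed=7):
    """Greedily count non-overlapping runs of n different colors in a random
    stream of m locations; return runs found per location examined."""
    rng = random.Random(seed)
    run = []     # colors in the current run of distinct colors, oldest first
    count = 0
    for _ in range(m):
        c = rng.randrange(n)
        if c in run:
            run = run[run.index(c) + 1:]   # keep only the part after the repeat
        run.append(c)
        if len(run) == n:
            count += 1
            run = []    # reset: a location belongs to at most one run
    return count / m

print(nonoverlap_rate(2))   # close to 1/3
print(nonoverlap_rate(3))   # close to 1/7
```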