Question #dec54

1 Answer
Jun 26, 2016

Approximately, the arithmetic mean equals 27.6.

Explanation:

First, the theory behind the method.

If a random variable xi takes values x_1, x_2, ..., x_n with corresponding probabilities p_1, p_2, ..., p_n, its mathematical expectation, or mean, is by definition
E(xi) = x_1*p_1+x_2*p_2+...+x_n*p_n = Sigma_(i in [1;n])(x_i*p_i)

Assume the numbers x_i are large enough to make this calculation inconvenient. Using two constants a and h of our own choosing (with h not equal to 0), we can always rewrite this formula as follows, where Sigma denotes summation over all i from 1 to n:
E(xi) = Sigma (x_i*p_i) = h*Sigma ((x_i-a)/h*p_i) + Sigma (a*p_i) =
= h*Sigma ((x_i-a)/h*p_i) + a*Sigma (p_i) =
= h*Sigma ((x_i-a)/h*p_i) + a
(since Sigma (p_i) = 1)
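
The identity above can be checked numerically. What follows is a minimal Python sketch, not part of the original answer: the function names `mean_direct` and `mean_coded` and the sample values and probabilities are illustrative assumptions. Exact fractions are used so that both sides can be compared without rounding error.

```python
from fractions import Fraction

def mean_direct(values, probs):
    """E(xi) = Sigma(x_i * p_i), computed directly."""
    return sum(x * p for x, p in zip(values, probs))

def mean_coded(values, probs, a, h):
    """E(xi) = h * Sigma((x_i - a)/h * p_i) + a.

    Valid when Sigma(p_i) = 1 and h != 0; values, a and h are integers here.
    """
    return h * sum(Fraction(x - a, h) * p for x, p in zip(values, probs)) + a

# Hypothetical distribution, chosen only to illustrate the identity.
values = [100, 200, 300, 400]
probs = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]

print(mean_direct(values, probs))           # 300
print(mean_coded(values, probs, 200, 100))  # 300
```

Any nonzero h and any a give the same result; the choice only affects how convenient the intermediate numbers are.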

Now it is up to us to choose the constants a and h in a way that simplifies the calculation as much as possible.

If the values x_i that our random variable xi takes are equally spaced, the calculation is simplest when a is chosen near the middle of these values and h is the step between them.

For example, if the values x_i are 10000, 20000, 30000, 40000, 50000, choosing a=30000 and h=10000 turns (x_i-a)/h into -2, -1, 0, 1, 2, which are much easier to deal with than the original large numbers.

Turning to our problem, we take the values x_i to be the midpoints of the intervals:
interval [12.5-17.5] has midpoint at 15
interval [17.5-22.5] has midpoint at 20
interval [22.5-27.5] has midpoint at 25
interval [27.5-32.5] has midpoint at 30
interval [32.5-37.5] has midpoint at 35
interval [37.5-42.5] has midpoint at 40
interval [42.5-47.5] has midpoint at 45
interval [47.5-52.5] has midpoint at 50
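
The midpoints listed above can be reproduced with a few lines of Python. This is only a sketch: the variable names `edges`, `midpoints` and `widths` are assumptions introduced for illustration, and the class boundaries are copied from the list.

```python
# Class boundaries copied from the intervals listed above.
edges = [12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5, 52.5]

# Midpoint of each interval [lo, hi] is (lo + hi) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]

# Width of each interval; a constant width means the midpoints are equally spaced.
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]

print(midpoints)    # [15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0]
print(set(widths))  # {5.0}
```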

For a we can choose 30, since it lies near the middle of the interval midpoints. For h we can choose 5, since that is the increment from one midpoint to the next.

The probabilities with which our random variable takes the above values are approximated by the observed relative frequencies. Each frequency is the number of times a value occurred (4 for the first interval, 20 for the second, and so on) divided by the total number of observations N=4+20+17+15+2+5+5+2=70.
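
As a small illustration, the relative frequencies can be computed as follows. This is a sketch under the assumption that the counts quoted in the paragraph above are the complete data; the names `counts`, `N` and `freqs` are introduced here for illustration.

```python
from fractions import Fraction

# Number of observations in each of the eight intervals, as quoted above.
counts = [4, 20, 17, 15, 2, 5, 5, 2]

N = sum(counts)                           # total number of observations
freqs = [Fraction(c, N) for c in counts]  # relative frequency of each interval

print(N)           # 70
print(sum(freqs))  # 1
```

The frequencies sum to 1, which is exactly the condition Sigma (p_i) = 1 used in the derivation.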

Now the mean value of our random variable is evaluated as
E(xi) = 5*((15-30)/5*4/70+(20-30)/5*20/70+
+(25-30)/5*17/70+(30-30)/5*15/70+
+(35-30)/5*2/70+(40-30)/5*5/70+
+(45-30)/5*5/70+(50-30)/5*2/70) + 30 =
= 5/70*((-3)*4+(-2)*20+(-1)*17+0*15+
+1*2+2*5+3*5+4*2) + 30 =
= 5/70*(-12-40-17+0+2+10+15+8) + 30 =
= 1/14*(-34) + 30 = 30 - 2.428571... = 27.571428...

In this case it seems sufficient to approximate the mean as 27.6.
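
As a cross-check, the whole computation can be reproduced in Python, once with the constants a=30 and h=5 and once directly from the midpoints and counts. This is a minimal sketch; the variable names are assumptions introduced for illustration, and exact fractions are used to avoid rounding error.

```python
from fractions import Fraction

midpoints = [15, 20, 25, 30, 35, 40, 45, 50]
counts = [4, 20, 17, 15, 2, 5, 5, 2]
a, h = 30, 5
N = sum(counts)

# Coded values (x_i - a)/h; integer division is exact because every
# x_i - a is a multiple of h for these midpoints.
coded = [(x - a) // h for x in midpoints]
coded_sum = sum(u * c for u, c in zip(coded, counts))

mean_coded = Fraction(h * coded_sum, N) + a
mean_direct = Fraction(sum(x * c for x, c in zip(midpoints, counts)), N)

print(coded)                 # [-3, -2, -1, 0, 1, 2, 3, 4]
print(coded_sum)             # -34
print(mean_coded)            # 193/7
print(mean_direct)           # 193/7
print(float(mean_coded))     # approximately 27.5714
```

Both routes give 193/7, which rounds to 27.6. Because the coded sum is negative, the correction h*Sigma(...) is subtracted from a rather than added to it.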