The German Tank Problem: Frequentist vs. Bayesian Approach
The Historical Problem:
During World War II, the Western Allies wanted to estimate the rate at which German tanks were being produced from only a small (statistically speaking) sample of data. Each manufactured German tank or piece of weaponry was stamped with a serial number. Using the serial numbers of damaged or captured German tanks, the Allies were able to estimate the total number of tanks and other machinery in the German arsenal.
Allied mathematicians could collect only a limited sample of German tank serial numbers, but used the maximum of that sample to estimate the population maximum. The statistical estimates proved far more accurate than those based on conventional intelligence gathering, which tended to wildly overestimate the number of tanks produced each month.
The problem can be approached using either frequentist inference or Bayesian inference, leading to different results. Estimating the population maximum based on a single sample yields divergent results, whereas estimation based on multiple samples is a practical estimation question whose answer is simple (especially in the frequentist setting) but not obvious (especially in the Bayesian setting). Here we consider the single-sample problem only.
The Frequentist and the Bayesian Approaches:
In the classical or frequentist (parametric) approach, we have a parameter or population characteristic of interest, which we regard as a fixed but unknown constant. Our objective is to estimate the parameter and to infer the value, or the range of values, that the parameter can possibly take, based on a random sample drawn from the population of interest. Here we consider a simple random sample of size n drawn without replacement from the population of tank serial numbers.
Alternatively, in the Bayesian framework, the unknown parameter is treated as a random quantity and is assigned a prior distribution quantifying our degree of belief, or prior information, regarding the parameter. Based on the sample data we modify or update our prior belief and obtain a posterior distribution, which is the distribution of the parameter conditional on the sample data.
Posterior information = prior information + information from sample data.
We summarize the posterior information using statistics such as the posterior mean or the posterior median, and provide measures of accuracy using the posterior standard deviation or the quartile deviation, respectively. We perform interval estimation using Highest Posterior Density (HPD) intervals.
We shall further calculate the errors of the estimates for the two approaches and compare them.
The Frequentist Approach:
Assume X denotes the serial number on a randomly selected destroyed or captured tank. It is assumed that
$$P(X = x) = \frac{1}{N}, \qquad x = 1, 2, \ldots, N,$$
i.e. X follows a discrete uniform distribution on $\{1, 2, \ldots, N\}$, where N is the unknown total number of tanks. Let $X_1, \ldots, X_n$ denote the sampled serial numbers and $M = \max(X_1, \ldots, X_n)$ the sample maximum.
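For concreteness, the sampling can be simulated in R as follows (a minimal sketch; the true value N = 1000 and the sample size n = 10 are arbitrary illustrative choices, and the variable names are our own):

set.seed(1)            # for reproducibility
N <- 1000              # true population maximum (unknown in practice)
n <- 10                # sample size
x <- sample(1:N, n)    # simple random sample drawn without replacement
m <- max(x)            # the sample maximum M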
Point Estimation:
The UMVUE for N is given by
$$\hat{N} = M\left(1 + \frac{1}{n}\right) - 1.$$
Now,
$$\operatorname{Var}(\hat{N}) = \frac{(N - n)(N + 1)}{n(n + 2)}.$$
An estimate of this variance is obtained by substituting $\hat{N}$ for N, giving the estimated standard error
$$\widehat{\operatorname{se}}(\hat{N}) = \sqrt{\frac{(\hat{N} - n)(\hat{N} + 1)}{n(n + 2)}}.$$
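Continuing the R sketch above, the point estimate and its estimated standard error are one-liners:

N.hat <- m * (1 + 1/n) - 1                                  # UMVUE of N
se.hat <- sqrt((N.hat - n) * (N.hat + 1) / (n * (n + 2)))   # estimated standard error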
Testing of Hypothesis:
We want to test
$$H_0 : N = N_0 \quad \text{against} \quad H_1 : N \neq N_0.$$
Under $H_0$ the sample maximum satisfies $P(M \le m) = \binom{m}{n}\big/\binom{N_0}{n}$.
We reject $H_0$ at level $\alpha$ if $M > N_0$ (impossible under $H_0$) or if $\binom{M}{n}\big/\binom{N_0}{n} \le \alpha$, i.e. if the observed maximum is implausibly small under $H_0$.
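As a hedged R illustration of this exact test (the hypothesized value N0 and the level 0.05 are our own choices):

N0 <- 1200                               # hypothesized value of N
p.val <- choose(m, n) / choose(N0, n)    # P(M <= observed m | N = N0)
reject <- (m > N0) || (p.val <= 0.05)    # reject H0 at level 0.05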
Interval Estimation:
Since $P(M \le m \mid N) = \binom{m}{n}\big/\binom{N}{n} \approx (m/N)^n$ for large N, we have $P(\alpha^{1/n} \le M/N \le 1) \approx 1 - \alpha$, and the (approximately) shortest $100(1 - \alpha)\%$ confidence interval for N based on M is
$$\left(M,\; M\,\alpha^{-1/n}\right).$$
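In R, continuing the variables above, the interval is immediate:

alpha <- 0.05
ci <- c(m, m * alpha^(-1/n))    # approximate shortest 95% confidence interval for N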
The Bayesian Approach:
Various choices can be imagined as the prior distribution for N:
• An improper uniform prior on all positive integers, i.e.
$$p(N) \propto 1 \quad \text{for } N = 1, 2, \ldots,$$
and 0 otherwise.
• A proper uniform distribution with an upper limit k for N, i.e.
$$p(N) = \frac{1}{k} \quad \text{for } N = 1, 2, \ldots, k,$$
and 0 otherwise.
Under the improper uniform prior, combining the likelihood $P(M = m \mid N) = \binom{m - 1}{n - 1}\big/\binom{N}{n}$, $N \ge m$, with the prior yields the posterior
$$p(N \mid m) = \frac{n - 1}{n} \cdot \frac{\binom{m - 1}{n - 1}}{\binom{N}{n}}, \qquad N = m, m + 1, \ldots,$$
which, being proportional to $(N - n)!/N!$, is a shifted factorial distribution.
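The posterior can be evaluated numerically in R; the sketch below truncates the infinite support at a large value, which is our own computational choice:

N.max <- 10000       # truncation point for numerical work
support <- m:N.max
log.post <- log(n - 1) - log(n) + lchoose(m - 1, n - 1) - lchoose(support, n)
post <- exp(log.post)    # posterior pmf p(N | m) on N = m, ..., N.max
sum(post)                # close to 1 if the truncation point is large enough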
The posterior distribution is extremely positively skewed. The posterior mode is at
$$N = m,$$
the observed sample maximum. The posterior mean is
$$E(N \mid m) = \frac{(m - 1)(n - 1)}{n - 2} \quad \text{for } n > 2,$$
and the posterior variance is
$$\operatorname{Var}(N \mid m) = \frac{(n - 1)(m - 1)(m - n + 1)}{(n - 3)(n - 2)^2} \quad \text{for } n > 3.$$
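As a quick sanity check, the numerical posterior mean from the R sketch above can be compared with the closed form (assuming n > 2):

post.mean <- sum(support * post)    # numerical posterior mean
(m - 1) * (n - 1) / (n - 2)         # closed-form posterior mean; the two should agree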
Since the posterior distribution is extremely positively skewed, quantile measures are more appropriate than moment measures for summarizing the posterior information. So the appropriate estimate of N in this case is the posterior median, and the measure of accuracy of the estimate is the posterior quartile deviation.
Posterior Quantiles and Highest Posterior Density (HPD) Intervals:
We now turn to the problem of calculating posterior quantiles. Let $N_q$ be the q-quantile of the posterior, i.e. the smallest integer satisfying
$$P(N \le N_q \mid m) \ge q.$$
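A hedged R sketch of the quantile, quartile-deviation, and HPD computations (not the original Höhle–Held code):

cdf <- cumsum(post)                                   # posterior cdf P(N <= Nq | m)
q.post <- function(q) support[min(which(cdf >= q))]   # smallest integer with cdf >= q
post.median <- q.post(0.5)                            # Bayesian point estimate of N
qd <- (q.post(0.75) - q.post(0.25)) / 2               # posterior quartile deviation
# The posterior pmf is decreasing in N on its support (mode at N = m), so the
# 95% HPD interval runs from m up to the 0.95 posterior quantile.
hpd <- c(m, q.post(0.95))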
Results obtained:
We choose N to be 1000. We obtain the required simulated data and demonstrate these computations using R. The R code used for the simulations and computations has been adapted from the paper by Höhle and Held (2006).
We performed the necessary computations for sample sizes n = 5, 10, 20, 50, 75 and 100. The results are summarized in the following table:
Comparison of the two approaches:
Here we see that both the frequentist and Bayesian estimates (using the improper uniform prior) are close to the true value of the parameter on average. Also, the variability in the estimates decreases, as expected, in both approaches as the sample size increases.
However, we observe that the Bayesian approach always gives a lower estimate than the corresponding frequentist estimate. Further, the variability in the estimates, measured by the standard error in the frequentist approach and by the posterior quartile deviation in the Bayesian approach, is smaller for the Bayesian estimates. Thus the point estimates provided by the Bayesian approach are better, at least in this case.
When comparing the interval estimates provided by the two approaches, it should be kept in mind that the interpretations of the results differ between the two. In the frequentist approach, a 95% shortest-length confidence interval ensures that the interval, on average, covers the true value of the parameter in 95% of repetitions of the underlying random experiment.
In the Bayesian approach, the 95% HPD interval gives the range of the posterior distribution of the parameter covering 95% of the total probability, such that every point in the interval has a higher mass/density than any point outside the interval, thus giving a range of plausible values of the parameter.
The interval estimates of N can be compared between the two approaches in terms of the lengths of the intervals. It is observed that the frequentist approach provides shorter intervals in this case.
Comparison of errors:
The errors (= estimated value - true value) for the different sample sizes in the two approaches are compared through the following graph:
We observe that the errors are almost the same for the two approaches, since the points lie very close to the line y = x.
Comparison with other priors:
We perform similar computations using the proper uniform prior (choosing the upper limit k to be 2000, a reasonable choice) and the negative binomial prior. We refer to the paper by Höhle and Held (2006) for details.
The results obtained using these priors, compared against the estimates obtained using the improper uniform prior, are summarized in the following table:

It can be seen that these priors perform worse than the improper uniform prior in this case in terms of point estimation.
Other applications:
- The same formula was used to estimate the number of iPhones sold: it was estimated that Apple had sold around 9.1 million iPhones by the end of September 2008.
- It was also used to estimate the total number of taxicabs in New York City.
References:
- Höhle, M.; Held, Leonhard (2006). "Bayesian Estimation of the Size of a Population" (PDF). Technical Report, SFB 386, No. 399, Department of Statistics, University of Munich. Retrieved 2016-04-17.
- Goodman, L. A. (1954). "Some Practical Techniques in Serial Number Analysis". Journal of the American Statistical Association. American Statistical Association. 49 (265): 97–112. doi:10.2307/2281038. JSTOR 2281038.
- Johnson, R. W. (Summer 1994). "Estimating the Size of a Population". Teaching Statistics. 16 (2): 50–52. doi:10.1111/j.1467-9639.1994.tb00688.x.
- Ruggles, R.; Brodie, H. (1947). "An Empirical Approach to Economic Intelligence in World War II". Journal of the American Statistical Association. 42: 72–91.

