A conflict(?) between Frequentists and Bayesians: The Jeffreys-Lindley Paradox

The Jeffreys-Lindley paradox is an apparently puzzling problem in statistical inference. It is often seen that frequentist and Bayesian approaches to testing a point null hypothesis (i.e. a simple hypothesis) lead to divergent results, especially when the sample size is large, and for different choices of the prior distribution of the parameter under study.

Statement of the paradox:

The paradox can be understood in the general setting as follows:
  • Let $x$ denote the observation or the data obtained from the experiment under study.
  • A test of significance rejects the null hypothesis $H_0 : \theta = \theta_0$ at level of significance $\alpha$.
  • The posterior probability of $H_0$, given the data $x$, is very high even for a small prior probability of $H_0$.
Lindley's original formulation of the problem in his paper "A Statistical Paradox" published in 1957 may be stated as follows:

Suppose we compare different sets of observations with varying sample sizes $n$, all of which produce equally significant p-values (say, 0.01) when a frequentist test of significance is performed. Then, as the sample size $n$ increases, the Bayesian approach reveals that the data increasingly support the null hypothesis. Thus the Bayesian approach accepts a null hypothesis which the frequentist approach rejects.

Lindley discussed this paradox in the context of Gaussian models. The paradox may be formally stated as follows:

In a Gaussian model $X \sim N(\theta, \sigma^2)$ with known variance $\sigma^2$, assume the null hypothesis $H_0 : \theta = \theta_0$ and any regular proper prior distribution on $\theta$. Then, for any testing level $\alpha$ and any $\varepsilon > 0$, we can find a sample size $n$ and independent, identically distributed data $x_1, x_2, \ldots, x_n$ such that

  • The sample mean $\bar{x}$ is significantly different from $\theta_0$ at level $\alpha$;
  • The posterior probability $P(H_0 \mid x_1, \ldots, x_n)$ is at least as big as $1 - \varepsilon$.
Thus it appears that the two approaches are at odds with each other regarding the acceptance or rejection of the null hypothesis based on the same set of observations.
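Before going through the algebra, the clash can be seen on a concrete numerical example. The following Python sketch is purely illustrative: the prior mass $c = 1/2$, $\sigma = 1$, the interval $(-5, 5)$, the sample size $n = 50{,}000$ and the critical value $2.5758$ are all assumed values, not taken from Lindley's paper. It computes both the frequentist two-sided p-value and the Bayesian posterior probability of the null for a sample mean sitting exactly on the 1% significance boundary.

```python
import math

def posterior_null(xbar, n, theta0=0.0, sigma=1.0, c=0.5, interval=(-5.0, 5.0)):
    """Posterior P(theta = theta0 | data) under Lindley's prior:
    probability mass c at theta0, the rest spread uniformly over `interval`."""
    width = interval[1] - interval[0]
    like_null = math.exp(-n * (xbar - theta0) ** 2 / (2 * sigma ** 2))
    # Since xbar lies well inside the interval, the integral of the Gaussian
    # likelihood over it is approximately sigma * sqrt(2 * pi / n).
    like_alt = sigma * math.sqrt(2 * math.pi / n) / width
    return c * like_null / (c * like_null + (1 - c) * like_alt)

def two_sided_p(xbar, n, theta0=0.0, sigma=1.0):
    """Two-sided p-value of the usual z-test for H0: theta = theta0."""
    z = abs(xbar - theta0) * math.sqrt(n) / sigma
    return math.erfc(z / math.sqrt(2))

# Place the sample mean exactly on the two-sided 1% significance boundary.
n = 50_000
z_alpha = 2.5758               # two-sided 1% critical value of N(0, 1)
xbar = z_alpha / math.sqrt(n)

print(two_sided_p(xbar, n))     # about 0.01: the frequentist test rejects H0
print(posterior_null(xbar, n))  # about 0.97: the Bayesian analysis favours H0
```

Both numbers are computed from the same data, yet they point in opposite directions: the test rejects the null at the 1% level while the posterior puts almost all its mass on it.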

Mathematical justification:

We proceed to show the apparent discrepancy between the two approaches in the case of the Gaussian model as stated originally by Lindley.

Let $x_1, x_2, \ldots, x_n$ be a random sample from a normal distribution of mean $\theta$ and known variance $\sigma^2$. Let the prior probability that $\theta = \theta_0$, the value under the null hypothesis, be $c$. Suppose that the remainder of the prior probability, $1 - c$, is distributed uniformly over some interval $I$ containing $\theta_0$. We shall deal with situations where $\bar{x}$, the arithmetic mean of the observations, and a minimal sufficient statistic for $\theta$, is well within the interval $I$.

The posterior probability that $\theta = \theta_0$, in the light of the sample drawn, is given by Bayes's theorem as

$$\bar{c} = \frac{c\, e^{-n(\bar{x} - \theta_0)^2 / 2\sigma^2}}{c\, e^{-n(\bar{x} - \theta_0)^2 / 2\sigma^2} + \dfrac{1-c}{|I|} \displaystyle\int_I e^{-n(\bar{x} - \theta)^2 / 2\sigma^2}\, d\theta},$$

where $|I|$ denotes the length of the interval $I$. By virtue of the assumptions about $\bar{x}$ and $I$, the integral can be evaluated as $\sigma\sqrt{2\pi/n}$.

Now suppose that the value of $\bar{x}$ is such that, on performing the usual significance test for the mean $\theta$ of a normal distribution with known variance, the result is significant at the $100\alpha\%$ point. That is, $\bar{x} = \theta_0 + z_{\alpha}\,\sigma/\sqrt{n}$, where $z_{\alpha}$ is a number dependent on $\alpha$ only and can be found from tables of the normal distribution function. Putting this value for $\bar{x}$, we have the following value for the posterior probability that $\theta = \theta_0$:

$$\bar{c} = \frac{c\, e^{-z_\alpha^2/2}}{c\, e^{-z_\alpha^2/2} + \dfrac{(1-c)\,\sigma}{|I|}\sqrt{2\pi/n}}.$$

(Note that $\sigma/\sqrt{n}$ tends to zero as $n$ increases, so that $\bar{x}$ will lie well within the interval $I$ for sufficiently large $n$.)

We observe that as $n \to \infty$, $\bar{c} \to 1$; i.e. the Bayesian approach will be increasingly inclined to accept the null hypothesis as the sample size $n$ increases while the p-value remains constant, leading to the paradox.
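This limiting behaviour is easy to check numerically. The sketch below is a minimal illustration, not Lindley's own computation: the prior mass $c = 1/2$, $\sigma = 1$, the interval length $|I| = 10$ and the grid of sample sizes are all assumed values. The test statistic is pinned at $z = 2.5758$, so the two-sided p-value stays at 0.01 while $n$ grows.

```python
import math

def posterior_given_p(n, c=0.5, sigma=1.0, width=10.0, z=2.5758):
    """Posterior probability of H0 when the sample mean sits exactly on the
    two-sided 1% significance boundary (test statistic fixed at z)."""
    like_null = c * math.exp(-z ** 2 / 2)
    like_alt = (1 - c) * sigma * math.sqrt(2 * math.pi / n) / width
    return like_null / (like_null + like_alt)

# The p-value is 0.01 in every row, yet the posterior climbs towards 1.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(posterior_given_p(n), 3))
```

The Gaussian factor $e^{-z^2/2}$ in the numerator is constant, while the $\sqrt{2\pi/n}$ term in the denominator shrinks with $n$, which is exactly why $\bar{c} \to 1$.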

Reasons behind the paradox:

We briefly discuss the reasons behind this apparent paradox.
  • For consistent tests used in the frequentist approach, the power of the test converges to 1 as the sample size increases. This means that even small deviations from the null hypothesis are regarded as significant, resulting in a small p-value; this in itself is not paradoxical, since any good test should be consistent.

  • The frequentist and Bayesian approaches answer two fundamentally different questions, and the results obtained from the two are easily misinterpreted. A small p-value obtained from the frequentist test indicates that the deviation from the null hypothesis is significant, but it does not take the alternative hypothesis into account, so it cannot conclude that the alternative is more plausible in the light of the given sample. A small p-value simply indicates that the data do not support the null hypothesis. The Bayesian approach, on the other hand, compares the posterior odds of the competing null and alternative hypotheses. It is to be understood that the null value to be tested is fundamentally different from the other values in the parameter space: we perform such tests only when the null value of the parameter holds particular interest for us. Now if the prior under $H_0$ is concentrated on a single point value and the prior under $H_1$ is very diffuse, such that the null value of the parameter is a better fit to the data than most, but not necessarily all, of the values in the parameter space, then the Bayesian approach concludes that the null is a better fit to the data than the alternative.
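The first point, that the power of a consistent test tends to 1, can also be made concrete. The following minimal sketch assumes a two-sided z-test at the 5% level (critical value 1.96), $\sigma = 1$, and a fixed small deviation $\delta = 0.05$ from the null; all of these values are chosen purely for illustration.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))

def power(delta, n, sigma=1.0, z_crit=1.96):
    """Power of the two-sided z-test at level 5% against theta = theta0 + delta."""
    shift = delta * math.sqrt(n) / sigma
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)

# Even a tiny deviation of 0.05 is detected almost surely once n is large.
for n in (100, 1_000, 10_000, 100_000):
    print(n, round(power(0.05, n), 3))
```

For small $n$ the test almost never detects the deviation, while for large $n$ the power is essentially 1: any fixed departure from the null, however small, eventually produces a tiny p-value.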

References:

  1.  Jeffreys, Harold (1939). Theory of Probability. Oxford University Press. MR 0000924.
  2.  Lindley, D.V. (1957). "A statistical paradox". Biometrika. 44 (1–2): 187–192. doi:10.1093/biomet/44.1-2.187. JSTOR 2333251.
  3.  Spanos, Aris (2013). "Who should be afraid of the Jeffreys-Lindley paradox?". Philosophy of Science. 80 (1): 73–93. doi:10.1086/668875.
