WSJT-X/boost/libs/math/doc/sf/poisson_optimisation.qbk

[sect:optim Optimisation Examples}

[h4 Poisson Distribution - Optimization and Accuracy is quite complicated.

The general formula for calculating the CDF uses the incomplete gamma thus:

  return gamma_q(k+1, mean);

But the case of small integral k is *very* common, so it is worth considering optimisation.

The first obvious step is to use a finite sum of each pdf (Probability *density* function)
for each value of k to build up the cdf (*cumulative* distribution function).

This could be done using the pdf function for the distribution,
for which there are two equivalent formulae:

  return exp(-mean + log(mean) * k - lgamma(k+1));
    
  return gamma_p_derivative(k+1, mean); 

The pdf would probably be more accurate using gamma_p_derivative.

The reason is that the expression:

  -mean + log(mean) * k - lgamma(k+1)

Will produce a value much smaller than the largest of the terms, so you get 
cancellation error: and then when you pass the result to exp() which 
converts the absolute error in its argument to a relative error in the 
result (explanation available if required), you effectively amplify the 
error further still.

gamma_p_derivative is just a thin wrapper around some of the internals of 
the incomplete gamma, it does its utmost to avoid issues like this, because 
this function is responsible for virtually all of the error in the result. 
Hopefully further advances in the future might improve things even further 
(what is really needed is an 'accurate' pow(1+x) function, but that's a whole 
other story!).

But calling pdf function makes repeated, redundant, checks on the value of mean and k,

  result += pdf(dist, i);
  
so it may be faster to substitute the formula for the pdf in a summation loop

  result += exp(-mean) * pow(mean, i) / unchecked_factorial(i);

(simplified by removing casting from RealType).

Of course, mean is unchanged during this summation,
so exp(mean) should only be calculated once, outside the loop.

Optimising compilers 'might' do this, but one can easily ensure this.

Obviously too, k must be small enough that unchecked_factorial is OK:
34 is an obvious choice as the limit for 32-bit float.
For larger k, the number of iterations is like to be uneconomic.
Only experiment can determine the optimum value of k
for any particular RealType (float, double...)

But also note that 

The incomplete gamma already optimises this case
(argument "a" is a small int),
although only when the result q (1-p) would be < 0.5.

And moreover, in the above series, each term can be calculated
from the previous one much more efficiently:

cdf = sum from 0 to k of C[k]

with:

C[0] = exp(-mean)

C[N+1] = C[N] * mean / (N+1)

So hopefully that's:

     {
       RealType result = exp(-mean);
       RealType term = result;
       for(int i = 1; i <= k; ++i)
       { // cdf is sum of pdfs.
          term *= mean / i;
          result += term;
       }
       return result;
     }

This is exactly the same finite sum as used by gamma_p/gamma_q internally.

As explained previously it's only used when the result

p > 0.5 or 1-p = q < 0.5.

The slight danger when using the sum directly like this, is that if 
the mean is small and k is large then you're calculating a value ~1, so 
conceivably you might overshoot slightly.  For this and other reasons in the 
case when p < 0.5 and q > 0.5 gamma_p/gamma_q use a different (infinite but 
rapidly converging) sum, so that danger isn't present since you always 
calculate the smaller of p and q.

So... it's tempting to suggest that you just call gamma_p/gamma_q as 
required.  However, there is a slight benefit for the k = 0 case because you 
avoid all the internal logic inside gamma_p/gamma_q trying to figure out 
which method to use etc.

For the incomplete beta function, there are no simple finite sums 
available (that I know of yet anyway), so when there's a choice between a 
finite sum of the PDF and an incomplete beta call, the finite sum may indeed 
win out in that case.

[endsect][/sect:optim Optimisation Examples}

[/ 
  Copyright 2006 John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]
Squashed 'boost/' content from commit b4feb19f2 git-subtree-dir: boost git-subtree-split: b4feb19f287ee92d87a9624b5d36b7cf46aeadeb 2018-06-09 21:48:32 +01:00			`[sect:optim Optimisation Examples}`

			`[h4 Poisson Distribution - Optimization and Accuracy is quite complicated.`

			`The general formula for calculating the CDF uses the incomplete gamma thus:`

			`return gamma_q(k+1, mean);`

			`But the case of small integral k is very common, so it is worth considering optimisation.`

			`The first obvious step is to use a finite sum of each pdf (Probability density function)`
			`for each value of k to build up the cdf (cumulative distribution function).`

			`This could be done using the pdf function for the distribution,`
			`for which there are two equivalent formulae:`

			`return exp(-mean + log(mean) * k - lgamma(k+1));`

			`return gamma_p_derivative(k+1, mean);`

			`The pdf would probably be more accurate using gamma_p_derivative.`

			`The reason is that the expression:`

			`-mean + log(mean) * k - lgamma(k+1)`

			`Will produce a value much smaller than the largest of the terms, so you get`
			`cancellation error: and then when you pass the result to exp() which`
			`converts the absolute error in its argument to a relative error in the`
			`result (explanation available if required), you effectively amplify the`
			`error further still.`

			`gamma_p_derivative is just a thin wrapper around some of the internals of`
			`the incomplete gamma, it does its utmost to avoid issues like this, because`
			`this function is responsible for virtually all of the error in the result.`
			`Hopefully further advances in the future might improve things even further`
			`(what is really needed is an 'accurate' pow(1+x) function, but that's a whole`
			`other story!).`

			`But calling pdf function makes repeated, redundant, checks on the value of mean and k,`

			`result += pdf(dist, i);`

			`so it may be faster to substitute the formula for the pdf in a summation loop`

			`result += exp(-mean) * pow(mean, i) / unchecked_factorial(i);`

			`(simplified by removing casting from RealType).`

			`Of course, mean is unchanged during this summation,`
			`so exp(mean) should only be calculated once, outside the loop.`

			`Optimising compilers 'might' do this, but one can easily ensure this.`

			`Obviously too, k must be small enough that unchecked_factorial is OK:`
			`34 is an obvious choice as the limit for 32-bit float.`
			`For larger k, the number of iterations is like to be uneconomic.`
			`Only experiment can determine the optimum value of k`
			`for any particular RealType (float, double...)`

			`But also note that`

			`The incomplete gamma already optimises this case`
			`(argument "a" is a small int),`
			`although only when the result q (1-p) would be < 0.5.`

			`And moreover, in the above series, each term can be calculated`
			`from the previous one much more efficiently:`

			`cdf = sum from 0 to k of C[k]`

			`with:`

			`C[0] = exp(-mean)`

			`C[N+1] = C[N] * mean / (N+1)`

			`So hopefully that's:`

			`{`
			`RealType result = exp(-mean);`
			`RealType term = result;`
			`for(int i = 1; i <= k; ++i)`
			`{ // cdf is sum of pdfs.`
			`term *= mean / i;`
			`result += term;`
			`}`
			`return result;`
			`}`

			`This is exactly the same finite sum as used by gamma_p/gamma_q internally.`

			`As explained previously it's only used when the result`

			`p > 0.5 or 1-p = q < 0.5.`

			`The slight danger when using the sum directly like this, is that if`
			`the mean is small and k is large then you're calculating a value ~1, so`
			`conceivably you might overshoot slightly. For this and other reasons in the`
			`case when p < 0.5 and q > 0.5 gamma_p/gamma_q use a different (infinite but`
			`rapidly converging) sum, so that danger isn't present since you always`
			`calculate the smaller of p and q.`

			`So... it's tempting to suggest that you just call gamma_p/gamma_q as`
			`required. However, there is a slight benefit for the k = 0 case because you`
			`avoid all the internal logic inside gamma_p/gamma_q trying to figure out`
			`which method to use etc.`

			`For the incomplete beta function, there are no simple finite sums`
			`available (that I know of yet anyway), so when there's a choice between a`
			`finite sum of the PDF and an incomplete beta call, the finite sum may indeed`
			`win out in that case.`

			`[endsect][/sect:optim Optimisation Examples}`

			`[/`
			`Copyright 2006 John Maddock and Paul A. Bristow.`
			`Distributed under the Boost Software License, Version 1.0.`
			`(See accompanying file LICENSE_1_0.txt or copy at`
			`http://www.boost.org/LICENSE_1_0.txt).`
			`]`