Squashed 'boost/' content from commit b4feb19f2

git-subtree-dir: boost
git-subtree-split: b4feb19f287ee92d87a9624b5d36b7cf46aeadeb
This commit is contained in:
Bill Somerville
2018-06-09 21:48:32 +01:00
commit 4ebe6417a5
12444 changed files with 2327021 additions and 0 deletions
+73
View File
@@ -0,0 +1,73 @@
[section:relative_error Relative Error]
Given an actual value /a/ and a found value /v/ the relative error can be
calculated from:
[equation error2]
However the test programs in the library use the symmetrical form:
[equation error1]
which measures /relative difference/ and happens to be less error
prone in use since we don't have to worry which value is the "true"
result, and which is the experimental one. It guarantees to return a value
at least as large as the relative error.
Special care needs to be taken when one value is zero: we could either take the
absolute error in this case (but that's cheating as the absolute error is likely
to be very small), or we could assign a value of either 1 or infinity to the
relative error in this special case. In the test cases for the special functions
in this library, everything below a threshold is regarded as "effectively zero",
otherwise the relative error is assigned the value of 1 if only one of the terms
is zero. The threshold is currently set at `std::numeric_limits<>::min()`:
in other words all denormalised numbers are regarded as a zero.
All the test programs calculate /quantized relative error/, whereas the graphs
in this manual are produced with the /actual error/. The difference is as
follows: in the test programs, the test data is rounded to the target real type
under test when the program is compiled,
so the error observed will then be a whole number of /units in the last place/
either rounded up from the actual error, or rounded down (possibly to zero).
In contrast the /true error/ is obtained by extending
the precision of the calculated value, and then comparing to the actual value:
in this case the calculated error may be some fraction of /units in the last place/.
Note that throughout this manual and the test programs the relative error is
usually quoted in units of epsilon. However, remember that /units in the last place/
more accurately reflect the number of contaminated digits, and that relative
error can /"wobble"/ by a factor of 2 compared to /units in the last place/.
In other words: two implementations of the same function, whose
maximum relative errors differ by a factor of 2, can actually be accurate
to the same number of binary digits. You have been warned!
[h4:zero_error The Impossibility of Zero Error]
For many of the functions in this library, it is assumed that the error is
"effectively zero" if the computation can be done with a number of guard
digits. However it should be remembered that if the result is a
/transcendental number/
then as a point of principle we can never be sure that the result is accurate
to more than 1 ulp. This is an example of what
[@http://en.wikipedia.org/wiki/William_Kahan] called
[@http://en.wikipedia.org/wiki/Rounding#The_table-maker.27s_dilemma]:
consider what happens if the first guard digit is a one, and the remaining guard digits are all zero.
Do we have a tie or not? Since the only thing we can tell about a transcendental number
is that its digits have no particular pattern, we can never tell if we have a tie,
no matter how many guard digits we have. Therefore, we can never be completely sure
that the result has been rounded in the right direction. Of course, transcendental
numbers that just happen to be a tie - for however many guard digits we have - are
extremely rare, and get rarer the more guard digits we have, but even so....
Refer to the classic text
[@http://docs.sun.com/source/806-3568/ncg_goldberg.html What Every Computer Scientist Should Know About Floating-Point Arithmetic]
for more information.
[endsect][/section:relative_error Relative Error]
[/
Copyright 2006, 2012 John Maddock and Paul A. Bristow.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt).
]
+664
View File
@@ -0,0 +1,664 @@
[section:sf_implementation Additional Implementation Notes]
The majority of the implementation notes are included with the documentation
of each function or distribution. The notes here are of a more general nature,
and reflect more the general implementation philosophy used.
[h4 Implementation philosophy]
"First be right, then be fast."
There will always be potential compromises
to be made between speed and accuracy.
It may be possible to find faster methods,
particularly for certain limited ranges of arguments,
but for most applications of math functions and distributions,
we judge that speed is rarely as important as accuracy.
So our priority is accuracy.
To permit evaluation of accuracy of the special functions,
production of extremely accurate tables of test values
has received considerable effort.
(It also required much CPU effort -
there was some danger of molten plastic dripping from the bottom of JM's laptop,
so instead, PAB's Dual-core desktop was kept 50% busy for [*days]
calculating some tables of test values!)
For a specific RealType, say `float` or `double`,
it may be possible to find approximations for some functions
that are simpler and thus faster, but less accurate
(perhaps because there are no refining iterations,
for example, when calculating inverse functions).
If these prove accurate enough to be "fit for his purpose",
then a user may substitute his custom specialization.
For example, there are approximations dating back from times
when computation was a [*lot] more expensive:
H Goldberg and H Levine, Approximate formulas for
percentage points and normalisation of t and chi squared,
Ann. Math. Stat., 17(4), 216 - 225 (Dec 1946).
A H Carter, Approximations to percentage points of the z-distribution,
Biometrika 34(2), 352 - 358 (Dec 1947).
These could still provide sufficient accuracy for some speed-critical applications.
[h4 Accuracy and Representation of Test Values]
In order to be accurate enough for as many as possible real types,
constant values are given to 50 decimal digits if available
(though many sources proved only accurate near to 64-bit double precision).
Values are specified as long double types by appending L,
unless they are exactly representable, for example integers, or binary fractions like 0.125.
This avoids the risk of loss of accuracy converting from double, the default type.
Values are used after `static_cast<RealType>(1.2345L)`
to provide the appropriate RealType for spot tests.
Functions that return constants values, like kurtosis for example, are written as
`static_cast<RealType>(-3) / 5;`
to provide the most accurate value
that the compiler can compute for the real type.
(The denominator is an integer and so will be promoted exactly).
So tests for one third, *not* exactly representable with radix two floating-point,
(should) use, for example:
`static_cast<RealType>(1) / 3;`
If a function is very sensitive to changes in input,
specifying an inexact value as input (such as 0.1) can throw
the result off by a noticeable amount: 0.1f is "wrong"
by ~1e-7 for example (because 0.1 has no exact binary representation).
That is why exact binary values - halves, quarters, and eighths etc -
are used in test code along with the occasional fraction `a/b` with `b`
a power of two (in order to ensure that the result is an exactly
representable binary value).
[h4 Tolerance of Tests]
The tolerances need to be set to the maximum of:
* Some epsilon value.
* The accuracy of the data (often only near 64-bit double).
Otherwise when long double has more digits than the test data, then no
amount of tweaking an epsilon based tolerance will work.
A common problem is when tolerances that are suitable for implementations
like Microsoft VS.NET where double and long double are the same size:
tests fail on other systems where long double is more accurate than double.
Check first that the suffix L is present, and then that the tolerance is big enough.
[h4 Handling Unsuitable Arguments]
In
[@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1665.pdf Errors in Mathematical Special Functions], J. Marraffino & M. Paterno
it is proposed that signalling a domain error is mandatory
when the argument would give an mathematically undefined result.
*Guideline 1
[:A mathematical function is said to be defined at a point a = (a1, a2, . . .)
if the limits as x = (x1, x2, . . .) 'approaches a from all directions agree'.
The defined value may be any number, or +infinity, or -infinity.]
Put crudely, if the function goes to + infinity
and then emerges 'round-the-back' with - infinity,
it is NOT defined.
[:The library function which approximates a mathematical function shall signal a domain error
whenever evaluated with argument values for which the mathematical function is undefined.]
*Guideline 2
[:The library function which approximates a mathematical function
shall signal a domain error whenever evaluated with argument values
for which the mathematical function obtains a non-real value.]
This implementation is believed to follow these proposals and to assist compatibility with
['ISO/IEC 9899:1999 Programming languages - C]
and with the
[@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1836.pdf Draft Technical Report on C++ Library Extensions, 2005-06-24, section 5.2.1, paragraph 5].
[link math_toolkit.error_handling See also domain_error].
See __policy_ref for details of the error handling policies that should allow
a user to comply with any of these recommendations, as well as other behaviour.
See [link math_toolkit.error_handling error handling]
for a detailed explanation of the mechanism, and
[link math_toolkit.stat_tut.weg.error_eg error_handling example]
and
[@../../example/error_handling_example.cpp error_handling_example.cpp]
[caution If you enable throw but do NOT have try & catch block,
then the program will terminate with an uncaught exception and probably abort.
Therefore to get the benefit of helpful error messages, enabling *all* exceptions
*and* using try&catch is recommended for all applications.
However, for simplicity, this is not done for most examples.]
[h4 Handling of Functions that are Not Mathematically defined]
Functions that are not mathematically defined,
like the Cauchy mean, fail to compile by default.
A [link math_toolkit.pol_ref.assert_undefined policy]
allows control of this.
If the policy is to permit undefined functions, then calling them
throws a domain error, by default. But the error policy can be set
to not throw, and to return NaN instead. For example,
`#define BOOST_MATH_DOMAIN_ERROR_POLICY ignore_error`
appears before the first Boost include,
then if the un-implemented function is called,
mean(cauchy<>()) will return std::numeric_limits<T>::quiet_NaN().
[warning If `std::numeric_limits<T>::has_quiet_NaN` is false
(for example, if T is a User-defined type without NaN support),
then an exception will always be thrown when a domain error occurs.
Catching exceptions is therefore strongly recommended.]
[h4 Median of distributions]
There are many distributions for which we have been unable to find an analytic formula,
and this has deterred us from implementing
[@http://en.wikipedia.org/wiki/Median median functions], the mid-point in a list of values.
However a useful numerical approximation for distribution `dist`
is available as usual as an accessor non-member function median using `median(dist)`,
that may be evaluated (in the absence of an analytic formula) by calling
`quantile(dist, 0.5)` (this is the /mathematical/ definition of course).
[@http://www.amstat.org/publications/jse/v13n2/vonhippel.html Mean, Median, and Skew, Paul T von Hippel]
[@http://documents.wolfram.co.jp/teachersedition/MathematicaBook/24.5.html Descriptive Statistics,]
[@http://documents.wolfram.co.jp/v5/Add-onsLinks/StandardPackages/Statistics/DescriptiveStatistics.html and ]
[@http://documents.wolfram.com/v5/TheMathematicaBook/AdvancedMathematicsInMathematica/NumericalOperationsOnData/3.8.1.html
Mathematica Basic Statistics.] give more detail, in particular for discrete distributions.
[h4 Handling of Floating-Point Infinity]
Some functions and distributions are well defined with + or - infinity as
argument(s), but after some experiments with handling infinite arguments
as special cases, we concluded that it was generally more useful to forbid this,
and instead to return the result of __domain_error.
Handling infinity as special cases is additionally complicated
because, unlike built-in types on most - but not all - platforms,
not all User-Defined Types are
specialized to provide `std::numeric_limits<RealType>::infinity()`
and would return zero rather than any representation of infinity.
The rationale is that non-finiteness may happen because of error
or overflow in the users code, and it will be more helpful for this
to be diagnosed promptly rather than just continuing.
The code also became much more complicated, more error-prone,
much more work to test, and much less readable.
However in a few cases, for example normal, where we felt it obvious,
we have permitted argument(s) to be infinity,
provided infinity is implemented for the `RealType` on that implementation,
and it is supported and tested by the distribution.
The range for these distributions is set to infinity if supported by the platform,
(by testing `std::numeric_limits<RealType>::has_infinity`)
else the maximum value provided for the `RealType` by Boost.Math.
Testing for has_infinity is obviously important for arbitrary precision types
where infinity makes much less sense than for IEEE754 floating-point.
So far we have not set `support()` function (only range)
on the grounds that the PDF is uninteresting/zero for infinities.
Users who require special handling of infinity (or other specific value) can,
of course, always intercept this before calling a distribution or function
and return their own choice of value, or other behavior.
This will often be simpler than trying to handle the aftermath of the error policy.
Overflow, underflow, denorm can be handled using __error_policy.
We have also tried to catch boundary cases where the mathematical specification
would result in divide by zero or overflow and signalling these similarly.
What happens at (and near), poles can be controlled through __error_policy.
[h4 Scale, Shape and Location]
We considered adding location and scale to the list of functions, for example:
template <class RealType>
inline RealType scale(const triangular_distribution<RealType>& dist)
{
RealType lower = dist.lower();
RealType mode = dist.mode();
RealType upper = dist.upper();
RealType result; // of checks.
if(false == detail::check_triangular(BOOST_CURRENT_FUNCTION, lower, mode, upper, &result))
{
return result;
}
return (upper - lower);
}
but found that these concepts are not defined (or their definition too contentious)
for too many distributions to be generally applicable.
Because they are non-member functions, they can be added if required.
[h4 Notes on Implementation of Specific Functions & Distributions]
* Default parameters for the Triangular Distribution.
We are uncertain about the best default parameters.
Some sources suggest that the Standard Triangular Distribution has
lower = 0, mode = half and upper = 1.
However as a approximation for the normal distribution,
the most common usage, lower = -1, mode = 0 and upper = 1 would be more suitable.
[h4 Rational Approximations Used]
Some of the special functions in this library are implemented via
rational approximations. These are either taken from the literature,
or devised by John Maddock using
[link math_toolkit.internals.minimax our Remez code].
Rational rather than Polynomial approximations are used to ensure
accuracy: polynomial approximations are often wonderful up to
a certain level of accuracy, but then quite often fail to provide much greater
accuracy no matter how many more terms are added.
Our own approximations were devised either for added accuracy
(to support 128-bit long doubles for example), or because
literature methods were unavailable or under non-BSL
compatible license. Our Remez code is known to produce good
agreement with literature results in fairly simple "toy" cases.
All approximations were checked
for convergence and to ensure that
they were not ill-conditioned (the coefficients can give a
theoretically good solution, but the resulting rational function
may be un-computable at fixed precision).
Recomputing using different
Remez implementations may well produce differing coefficients: the
problem is well known to be ill conditioned in general, and our Remez implementation
often found a broad and ill-defined minima for many of these approximations
(of course for simple "toy" examples like approximating `exp` the minima
is well defined, and the coefficients should agree no matter whose Remez
implementation is used). This should not in general effect the validity
of the approximations: there's good literature supporting the idea that
coefficients can be "in error" without necessarily adversely effecting
the result. Note that "in error" has a special meaning in this context,
see [@http://front.math.ucdavis.edu/0101.5042
"Approximate construction of rational approximations and the effect
of error autocorrection.", Grigori Litvinov, eprint arXiv:math/0101042].
Therefore the coefficients still need to be accurately calculated, even if they can
be in error compared to the "true" minimax solution.
[h4 Representation of Mathematical Constants]
A macro BOOST_DEFINE_MATH_CONSTANT in constants.hpp is used
to provide high accuracy constants to mathematical functions and distributions,
since it is important to provide values uniformly for both built-in
float, double and long double types,
and for User Defined types in __multiprecision like __cpp_dec_float.
and others like NTL::quad_float and NTL::RR.
To permit calculations in this Math ToolKit and its tests, (and elsewhere)
at about 100 decimal digits with NTL::RR type,
it is obviously necessary to define constants to this accuracy.
However, some compilers do not accept decimal digits strings as long as this.
So the constant is split into two parts, with the 1st containing at least
long double precision, and the 2nd zero if not needed or known.
The 3rd part permits an exponent to be provided if necessary (use zero if none) -
the other two parameters may only contain decimal digits (and sign and decimal point),
and may NOT include an exponent like 1.234E99 (nor a trailing F or L).
The second digit string is only used if T is a User-Defined Type,
when the constant is converted to a long string literal and lexical_casted to type T.
(This is necessary because you can't use a numeric constant
since even a long double might not have enough digits).
For example, pi is defined:
BOOST_DEFINE_MATH_CONSTANT(pi,
3.141592653589793238462643383279502884197169399375105820974944,
5923078164062862089986280348253421170679821480865132823066470938446095505,
0)
And used thus:
using namespace boost::math::constants;
double diameter = 1.;
double radius = diameter * pi<double>();
or boost::math::constants::pi<NTL::RR>()
Note that it is necessary (if inconvenient) to specify the type explicitly.
So you cannot write
double p = boost::math::constants::pi<>(); // could not deduce template argument for 'T'
Neither can you write:
double p = boost::math::constants::pi; // Context does not allow for disambiguation of overloaded function
double p = boost::math::constants::pi(); // Context does not allow for disambiguation of overloaded function
[h4 Thread safety]
Reporting of error by setting `errno` should be thread-safe already
(otherwise none of the std lib math functions would be thread safe?).
If you turn on reporting of errors via exceptions, `errno` gets left unused anyway.
For normal C++ usage, the Boost.Math `static const` constants are now thread-safe so
for built-in real-number types: `float`, `double` and `long double` are all thread safe.
For User_defined types, for example, __cpp_dec_float,
the Boost.Math should also be thread-safe,
(thought we are unsure how to rigorously prove this).
(Thread safety has received attention in the C++11 Standard revision,
so hopefully all compilers will do the right thing here at some point.)
[h4 Sources of Test Data]
We found a large number of sources of test data.
We have assumed that these are /"known good"/
if they agree with the results from our test
and only consulted other sources for their /'vote'/
in the case of serious disagreement.
The accuracy, actual and claimed, vary very widely.
Only [@http://functions.wolfram.com/ Wolfram Mathematica functions]
provided a higher accuracy than
C++ double (64-bit floating-point) and was regarded as
the most-trusted source by far.
The __R provided the widest range of distributions,
but the usual Intel X86 distribution uses 64-but doubles,
so our use was limited to the 15 to 17 decimal digit accuracy.
A useful index of sources is:
[@http://www.sal.hut.fi/Teaching/Resources/ProbStat/table.html
Web-oriented Teaching Resources in Probability and Statistics]
[@http://espse.ed.psu.edu/edpsych/faculty/rhale/hale/507Mat/statlets/free/pdist.htm Statlet]:
Is a Javascript application that calculates and plots probability distributions,
and provides the most complete range of distributions:
[:Bernoulli, Binomial, discrete uniform, geometric, hypergeometric,
negative binomial, Poisson, beta, Cauchy-Lorentz, chi-sequared, Erlang,
exponential, extreme value, Fisher, gamma, Laplace, logistic,
lognormal, normal, Parteo, Student's t, triangular, uniform, and Weibull.]
It calculates pdf, cdf, survivor, log survivor, hazard, tail areas,
& critical values for 5 tail values.
It is also the only independent source found for the Weibull distribution;
unfortunately it appears to suffer from very poor accuracy in areas where
the underlying special function is known to be difficult to implement.
[h4 Testing for Invalid Parameters to Functions and Constructors]
After finding that some 'bad' parameters (like NaN) were not throwing
a `domain_error` exception as they should, a function
`check_out_of_range` (in `test_out_of_range.hpp`)
was devised by JM to check
(using Boost.Test's BOOST_CHECK_THROW macro)
that bad parameters passed to constructors and functions throw `domain_error` exceptions.
Usage is `check_out_of_range< DistributionType >(list-of-params);`
Where list-of-params is a list of *valid* parameters from which the distribution can be constructed
- ie the same number of args are passed to the function,
as are passed to the distribution constructor.
The values of the parameters are not important, but must be *valid* to pass the constructor checks;
the default values are suitable, but must be explicitly provided, for example:
check_out_of_range<extreme_value_distribution<RealType> >(1, 2);
Checks made are:
* Infinity or NaN (if available) passed in place of each of the valid params.
* Infinity or NaN (if available) as a random variable.
* Out-of-range random variable passed to pdf and cdf
(ie outside of "range(DistributionType)").
* Out-of-range probability passed to quantile function and complement.
but does *not* check finite but out-of-range parameters to the constructor
because these are specific to each distribution, for example:
BOOST_CHECK_THROW(pdf(pareto_distribution<RealType>(0, 1), 0), std::domain_error);
BOOST_CHECK_THROW(pdf(pareto_distribution<RealType>(1, 0), 0), std::domain_error);
checks `scale` and `shape` parameters are both > 0
by checking that `domain_error` exception is thrown if either are == 0.
(Use of `check_out_of_range` function may mean that some previous tests are now redundant).
It was also noted that if more than one parameter is bad,
then only the first detected will be reported by the error message.
[h4 Creating and Managing the Equations]
Equations that fit on a single line can most easily be produced by inline Quickbook code
using templates for Unicode Greek and Unicode Math symbols.
All Greek letter and small set of Math symbols is available at
/boost-path/libs/math/doc/sf_and_dist/html4_symbols.qbk
Where equations need to use more than one line, real Math editors were used.
The primary source for the equations is now
[@http://www.w3.org/Math/ MathML]: see the
*.mml files in libs\/math\/doc\/sf_and_dist\/equations\/.
These are most easily edited by a GUI editor such as
[@http://mathcast.sourceforge.net/home.html Mathcast],
please note that the equation editor supplied with Open Office
currently mangles these files and should not currently be used.
Conversion to SVG was achieved using
[@https://sourceforge.net/projects/svgmath/ SVGMath] and a command line
such as:
[pre
$for file in *.mml; do
>/cygdrive/c/Python25/python.exe 'C:\download\open\SVGMath-0.3.1\math2svg.py' \\
>>$file > $(basename $file .mml).svg
>done
]
See also the section on "Using Python to run Inkscape" and
"Using inkscape to convert scalable vector SVG files to Portable Network graphic PNG".
Note that SVGMath requires that the mml files are *not* wrapped in an XHTML
XML wrapper - this is added by Mathcast by default - one workaround is to
copy an existing mml file and then edit it with Mathcast: the existing
format should then be preserved. This is a bug in the XML parser used by
SVGMath which the author is aware of.
If necessary the XHTML wrapper can be removed with:
[pre cat filename | tr -d "\\r\\n" \| sed -e 's\/.*\\(<math\[^>\]\*>.\*<\/math>\\).\*\/\\1\/' > newfile]
Setting up fonts for SVGMath is currently rather tricky, on a Windows XP system
JM's font setup is the same as the sample config file provided with SVGMath
but with:
[pre
<!\-\- Double\-struck \-\->
<mathvariant name\="double\-struck" family\="Mathematica7, Lucida Sans Unicode"\/>
]
changed to:
[pre
<!\-\- Double\-struck \-\->
<mathvariant name\="double\-struck" family\="Lucida Sans Unicode"\/>
]
Note that unlike the sample config file supplied with SVGMath, this does not
make use of the [@http://support.wolfram.com/technotes/fonts/windows/latestfonts.html Mathematica 7 font]
as this lacks sufficient Unicode information
for it to be used with either SVGMath or XEP "as is".
Also note that the SVG files in the repository are almost certainly
Windows-specific since they reference various Windows Fonts.
PNG files can be created from the SVGs using
[@http://xmlgraphics.apache.org/batik/tools/rasterizer.html Batik]
and a command such as:
[pre java -jar 'C:\download\open\batik-1.7\batik-rasterizer.jar' -dpi 120 *.svg]
Or using Inkscape (File, Export bitmap, Drawing tab, bitmap size (default size, 100 dpi), Filename (default). png)
or Using Cygwin, a command such as:
[pre for file in *.svg; do
/cygdrive/c/progra~1/Inkscape/inkscape -d 120 -e $(cygpath -a -w $(basename $file .svg).png) $(cygpath -a -w $file);
done]
Using BASH
[pre # Convert single SVG to PNG file.
# /c/progra~1/Inkscape/inkscape -d 120 -e a.png a.svg
]
or to convert All files in folder SVG to PNG.
[pre
for file in *.svg; do
/c/progra~1/Inkscape/inkscape -d 120 -e $(basename $file .svg).png $file
done
]
Currently Inkscape seems to generate the better looking PNGs.
The PDF is generated into \pdf\math.pdf
using a command from a shell or command window with current directory
\math_toolkit\libs\math\doc\sf_and_dist, typically:
[pre bjam -a pdf >math_pdf.log]
Note that XEP will have to be configured to *use and embed*
whatever fonts are used by the SVG equations
(almost certainly editing the sample xep.xml provided by the XEP installation).
If you fail to do this you will get XEP warnings in the log file like
[pre \[warning\]could not find any font family matching "Times New Roman"; replaced by Helvetica]
(html is the default so it is generated at libs\math\doc\html\index.html
using command line >bjam -a > math_toolkit.docs.log).
<!-- Sample configuration for Windows TrueType fonts. -->
is provided in the xep.xml downloaded, but the Windows TrueType fonts are commented out.
JM's XEP config file \xep\xep.xml has the following font configuration section added:
[pre
<font\-group xml:base\="file:\/C:\/Windows\/Fonts\/" label\="Windows TrueType" embed\="true" subset\="true">
<font\-family name\="Arial">
<font><font\-data ttf\="arial.ttf"\/><\/font>
<font style\="oblique"><font\-data ttf\="ariali.ttf"\/><\/font>
<font weight\="bold"><font\-data ttf\="arialbd.ttf"\/><\/font>
<font weight\="bold" style\="oblique"><font\-data ttf\="arialbi.ttf"\/><\/font>
<\/font\-family>
<font\-family name\="Times New Roman" ligatures\="&#xFB01; &#xFB02;">
<font><font\-data ttf\="times.ttf"\/><\/font>
<font style\="italic"><font\-data ttf\="timesi.ttf"\/><\/font>
<font weight\="bold"><font\-data ttf\="timesbd.ttf"\/><\/font>
<font weight\="bold" style\="italic"><font\-data ttf\="timesbi.ttf"\/><\/font>
<\/font\-family>
<font\-family name\="Courier New">
<font><font\-data ttf\="cour.ttf"\/><\/font>
<font style\="oblique"><font\-data ttf\="couri.ttf"\/><\/font>
<font weight\="bold"><font\-data ttf\="courbd.ttf"\/><\/font>
<font weight\="bold" style\="oblique"><font\-data ttf\="courbi.ttf"\/><\/font>
<\/font\-family>
<font\-family name\="Tahoma" embed\="true">
<font><font\-data ttf\="tahoma.ttf"\/><\/font>
<font weight\="bold"><font\-data ttf\="tahomabd.ttf"\/><\/font>
<\/font\-family>
<font\-family name\="Verdana" embed\="true">
<font><font\-data ttf\="verdana.ttf"\/><\/font>
<font style\="oblique"><font\-data ttf\="verdanai.ttf"\/><\/font>
<font weight\="bold"><font\-data ttf\="verdanab.ttf"\/><\/font>
<font weight\="bold" style\="oblique"><font\-data ttf\="verdanaz.ttf"\/><\/font>
<\/font\-family>
<font\-family name\="Palatino" embed\="true" ligatures\="&#xFB00; &#xFB01; &#xFB02; &#xFB03; &#xFB04;">
<font><font\-data ttf\="pala.ttf"\/><\/font>
<font style\="italic"><font\-data ttf\="palai.ttf"\/><\/font>
<font weight\="bold"><font\-data ttf\="palab.ttf"\/><\/font>
<font weight\="bold" style\="italic"><font\-data ttf\="palabi.ttf"\/><\/font>
<\/font\-family>
<font-family name="Lucida Sans Unicode">
<!-- <font><font-data ttf="lsansuni.ttf"></font> -->
<!-- actually called l_10646.ttf on Windows 2000 and Vista Sp1 -->
<font><font-data ttf="l_10646.ttf"/></font>
</font-family>
]
PAB had to alter his because the Lucida Sans Unicode font had a different name.
Other changes are very likely to be required if you are not using Windows.
XZ authored his equations using the venerable Latex, JM converted these to
MathML using [@http://gentoo-wiki.com/HOWTO_Convert_LaTeX_to_HTML_with_MathML mxlatex].
This process is currently unreliable and required some manual intervention:
consequently Latex source is not considered a viable route for the automatic
production of SVG versions of equations.
Equations are embedded in the quickbook source using the /equation/
template defined in math.qbk. This outputs Docbook XML that looks like:
[pre
<inlinemediaobject>
<imageobject role="html">
<imagedata fileref="../equations/myfile.png"></imagedata>
</imageobject>
<imageobject role="print">
<imagedata fileref="../equations/myfile.svg"></imagedata>
</imageobject>
</inlinemediaobject>
]
MathML is not currently present in the Docbook output, or in the
generated HTML: this needs further investigation.
[h4 Producing Graphs]
Graphs were produced in SVG format and then converted to PNG's using the same
process as the equations.
The programs
`/libs/math/doc/sf_and_dist/graphs/dist_graphs.cpp`
and `/libs/math/doc/sf_and_dist/graphs/sf_graphs.cpp`
generate the SVG's directly using the
[@http://code.google.com/soc/2007/boost/about.html Google Summer of Code 2007]
project of Jacob Voytko (whose work so far,
considerably enhanced and now reasonably mature and usable, by Paul A. Bristow,
is at .\boost-sandbox\SOC\2007\visualization).
[endsect] [/section:sf_implementation Implementation Notes]
[/
Copyright 2006, 2007, 2010 John Maddock and Paul A. Bristow.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt).
]
+246
View File
@@ -0,0 +1,246 @@
[section:lanczos The Lanczos Approximation]
[h4 Motivation]
['Why base gamma and gamma-like functions on the Lanczos approximation?]
First of all I should make clear that for the gamma function
over real numbers (as opposed to complex ones)
the Lanczos approximation (See [@http://en.wikipedia.org/wiki/Lanczos_approximation Wikipedia or ]
[@http://mathworld.wolfram.com/LanczosApproximation.html Mathworld])
appears to offer no clear advantage over more traditional methods such as
[@http://en.wikipedia.org/wiki/Stirling_approximation Stirling's approximation].
__pugh carried out an extensive comparison of the various methods available
and discovered that they were all very similar in terms of complexity
and relative error. However, the Lanczos approximation does have a couple of
properties that make it worthy of further consideration:
* The approximation has an easy to compute truncation error that holds for
all /z > 0/. In practice that means we can use the same approximation for all
/z > 0/, and be certain that no matter how large or small /z/ is, the truncation
error will /at worst/ be bounded by some finite value.
* The approximation has a form that is particularly amenable to analytic
manipulation, in particular ratios of gamma or gamma-like functions
are particularly easy to compute without resorting to logarithms.
It is the combination of these two properties that make the approximation
attractive: Stirling's approximation is highly accurate for large z, and
has some of the same analytic properties as the Lanczos approximation, but
can't easily be used across the whole range of z.
As the simplest example, consider the ratio of two gamma functions: one could
compute the result via lgamma:
exp(lgamma(a) - lgamma(b));
However, even if lgamma is uniformly accurate to 0.5ulp, the worst case
relative error in the above can easily be shown to be:
Erel > a * log(a)/2 + b * log(b)/2
For small /a/ and /b/ that's not a problem, but to put the relationship another
way: ['each time a and b increase in magnitude by a factor of 10, at least one
decimal digit of precision will be lost.]
In contrast, by analytically combining like power
terms in a ratio of Lanczos approximation's, these errors can be virtually eliminated
for small /a/ and /b/, and kept under control for very large (or very small
for that matter) /a/ and /b/. Of course, computing large powers is itself a
notoriously hard problem, but even so, analytic combinations of Lanczos
approximations can make the difference between obtaining a valid result, or
simply garbage. Refer to the implementation notes for the __beta function for
an example of this method in practice. The incomplete
[link math_toolkit.sf_gamma.igamma gamma_p gamma] and
[link math_toolkit.sf_beta.ibeta_function beta] functions
use similar analytic combinations of power terms, to combine gamma and beta
functions divided by large powers into single (simpler) expressions.
[h4 The Approximation]
The Lanczos Approximation to the Gamma Function is given by:
[equation lanczos0]
Where S[sub g](z) is an infinite sum, that is convergent for all z > 0,
and /g/ is an arbitrary parameter that controls the "shape" of the
terms in the sum which is given by:
[equation lanczos0a]
With individual coefficients defined in closed form by:
[equation lanczos0b]
However, evaluation of the sum in that form can lead to numerical instability
in the computation of the ratios of rising and falling factorials (effectively
we're multiplying by a series of numbers very close to 1, so roundoff errors
can accumulate quite rapidly).
The Lanczos approximation is therefore often written in partial fraction form
with the leading constants absorbed by the coefficients in the sum:
[equation lanczos1]
where:
[equation lanczos2]
Again parameter /g/ is an arbitrarily chosen constant, and /N/ is an arbitrarily chosen
number of terms to evaluate in the "Lanczos sum" part.
[note
Some authors
choose to define the sum from k=1 to N, and hence end up with N+1 coefficients.
This happens to confuse both the following discussion and the code (since C++
deals with half open array ranges, rather than the closed range of the sum).
This convention is consistent with __godfrey, but not __pugh, so take care
when referring to the literature in this field.]
[h4 Computing the Coefficients]
The coefficients C0..CN-1 need to be computed from /N/ and /g/
at high precision, and then stored as part of the program.
Calculation of the coefficients is performed via the method of __godfrey;
let the constants be contained in a column vector P, then:
P = D B C F
where B is an NxN matrix:
[equation lanczos4]
D is an NxN matrix:
[equation lanczos3]
C is an NxN matrix:
[equation lanczos5]
and F is an N element column vector:
[equation lanczos6]
Note than the matrices B, D and C contain all integer terms and depend
only on /N/, this product should be computed first, and then multiplied
by /F/ as the last step.
[h4 Choosing the Right Parameters]
The trick is to choose
/N/ and /g/ to give the desired level of accuracy: choosing a small value for
/g/ leads to a strictly convergent series, but one which converges only slowly.
Choosing a larger value of /g/ causes the terms in the series to be large
and\/or divergent for about the first /g-1/ terms, and to then suddenly converge
with a "crunch".
__pugh has determined the optimal
value of /g/ for /N/ in the range /1 <= N <= 60/: unfortunately in practice choosing
these values leads to cancellation errors in the Lanczos sum as the largest
term in the (alternating) series is approximately 1000 times larger than the result.
These optimal values appear not to be useful in practice unless the evaluation
can be done with a number of guard digits /and/ the coefficients are stored
at higher precision than that desired in the result. These values are best
reserved for say, computing to float precision with double precision arithmetic.
[table Optimal choices for N and g when computing with guard digits (source: Pugh)
[[Significand Size] [N] [g][Max Error]]
[[24] [6] [5.581][9.51e-12]]
[[53][13][13.144565][9.2213e-23]]
]
The alternative described by __godfrey is to perform an exhaustive
search of the /N/ and /g/ parameter space to determine the optimal combination for
a given /p/ digit floating-point type. Repeating this work found a good
approximation for double precision arithmetic (close to the one __godfrey found),
but failed to find really
good approximations for 80 or 128-bit long doubles. Further it was observed
that the approximations obtained tended to optimised for the small values
of z (1 < z < 200) used to test the implementation against the factorials.
Computing ratios of gamma functions with large arguments were observed to
suffer from error resulting from the truncation of the Lancozos series.
__pugh identified all the locations where the theoretical error of the
approximation were at a minimum, but unfortunately has published only the largest
of these minima. However, he makes the observation that the minima
coincide closely with the location where the first neglected term (a[sub N]) in the
Lanczos series S[sub g](z) changes sign. These locations are quite easy to
locate, albeit with considerable computer time. These "sweet spots" need
only be computed once, tabulated, and then searched when required for an
approximation that delivers the required precision for some fixed precision
type.
Unfortunately, following this path failed to find a really good approximation
for 128-bit long doubles, and those found for 64 and 80-bit reals required an
excessive number of terms. There are two competing issues here: high precision
requires a large value of /g/, but avoiding cancellation errors in the evaluation
requires a small /g/.
At this point note that the Lanczos sum can be converted into rational form
(a ratio of two polynomials, obtained from the partial-fraction form using
polynomial arithmetic),
and doing so changes the coefficients so that /they are all positive/. That
means that the sum in rational form can be evaluated without cancellation
error, albeit with double the number of coefficients for a given N. Repeating
the search of the "sweet spots", this time evaluating the Lanczos sum in
rational form, and testing only those "sweet spots" whose theoretical error
is less than the machine epsilon for the type being tested, yielded good
approximations for all the types tested. The optimal values found were quite
close to the best cases reported by __pugh (just slightly larger /N/ and slightly
smaller /g/ for a given precision than __pugh reports), and even though converting
to rational form doubles the number of stored coefficients, it should be
noted that half of them are integers (and therefore require less storage space)
and the approximations require a smaller /N/ than would otherwise be required,
so fewer floating point operations may be required overall.
The following table shows the optimal values for /N/ and /g/ when computing
at fixed precision. These should be taken as work in progress: there are no
values for 106-bit significand machines (Darwin long doubles & NTL quad_float),
and further optimisation of the values of /g/ may be possible.
Errors given in the table
are estimates of the error due to truncation of the Lanczos infinite series
to /N/ terms. They are calculated from the sum of the first five neglected
terms - and are known to be rather pessimistic estimates - although it is noticeable
that the best combinations of /N/ and /g/ occurred when the estimated truncation error
almost exactly matches the machine epsilon for the type in question.
[table Optimum value for N and g when computing at fixed precision
[[Significand Size][Platform/Compiler Used][N][g][Max Truncation Error]]
[[24][Win32, VC++ 7.1] [6] [1.428456135094165802001953125][9.41e-007]]
[[53][Win32, VC++ 7.1] [13] [6.024680040776729583740234375][3.23e-016]]
[[64][Suse Linux 9 IA64, gcc-3.3.3] [17] [12.2252227365970611572265625][2.34e-024]]
[[116][HP Tru64 Unix 5.1B \/ Alpha, Compaq C++ V7.1-006] [24] [20.3209821879863739013671875][4.75e-035]]
]
Finally note that the Lanczos approximation can be written as follows
by removing a factor of exp(g) from the denominator, and then dividing
all the coefficients by exp(g):
[equation lanczos7]
This form is more convenient for calculating lgamma, but for the gamma
function the division by /e/ turns a possibly exact quality into an
inexact value: this reduces accuracy in the common case that
the input is exact, and so isn't used for the gamma function.
[h4 References]
# [#godfrey]Paul Godfrey, [@http://my.fit.edu/~gabdo/gamma.txt "A note on the computation of the convergent
Lanczos complex Gamma approximation"].
# [#pugh]Glendon Ralph Pugh,
[@http://bh0.physics.ubc.ca/People/matt/Doc/ThesesOthers/Phd/pugh.pdf
"An Analysis of the Lanczos Gamma Approximation"],
PhD Thesis November 2004.
# Viktor T. Toth,
[@http://www.rskey.org/gamma.htm "Calculators and the Gamma Function"].
# Mathworld, [@http://mathworld.wolfram.com/LanczosApproximation.html
The Lanczos Approximation].
[endsect][/section:lanczos The Lanczos Approximation]
[/
Copyright 2006 John Maddock and Paul A. Bristow.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt).
]
+115
View File
@@ -0,0 +1,115 @@
[section:refs References]
[h4 General references]
(Specific detailed sources for individual functions and distributions
are given at the end of each individual section).
[@http://dlmf.nist.gov/ DLMF (NIST Digital Library of Mathematical Functions)]
is a replacement for the legendary
Abramowitz and Stegun's Handbook of Mathematical Functions (often called simply A&S),
M. Abramowitz and I. A. Stegun (Eds.) (1964)
Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables,
National Bureau of Standards Applied Mathematics Series,
U.S. Government Printing Office, Washington, D.C.
[/ __Abramowitz_Stegun]
NIST Handbook of Mathematical Functions
Edited by: Frank W. J. Olver, University of Maryland and National Institute of Standards and Technology, Maryland,
Daniel W. Lozier, National Institute of Standards and Technology, Maryland,
Ronald F. Boisvert, National Institute of Standards and Technology, Maryland,
Charles W. Clark, National Institute of Standards and Technology, Maryland and University of Maryland.
ISBN: 978-0521140638 (paperback), 9780521192255 (hardback), July 2010, Cambridge University Press.
[@http://www.itl.nist.gov/div898/handbook/index.htm NIST/SEMATECH e-Handbook of Statistical Methods]
[@http://documents.wolfram.com/mathematica/Add-onsLinks/StandardPackages/Statistics/DiscreteDistributions.html Mathematica Documentation: DiscreteDistributions]
The Wolfram Research Documentation Center is a collection of online reference materials about Mathematica, CalculationCenter, and other Wolfram Research products.
[@http://documents.wolfram.com/mathematica/Add-onsLinks/StandardPackages/Statistics/ContinuousDistributions.html Mathematica Documentation: ContinuousDistributions]
The Wolfram Research Documentation Center is a collection of online reference materials about Mathematica, CalculationCenter, and other Wolfram Research products.
Statistical Distributions (Wiley Series in Probability & Statistics) (Paperback)
by N.A.J. Hastings, Brian Peacock, Merran Evans, ISBN: 0471371246, Wiley 2000.
[@http://www.worldscibooks.com/mathematics/p191.html Extreme Value Distributions, Theory and Applications]
Samuel Kotz & Saralees Nadarajah, ISBN 978-1-86094-224-2 & 1-86094-224-5 Oct 2000,
Chapter 1.2 discusses the various extreme value distributions.
[@http://bh0.physics.ubc.ca/People/matt/Doc/ThesesOthers/Phd/pugh.pdf pugh.pdf (application/pdf Object)]
Pugh Msc Thesis on the Lanczos approximation to the gamma function.
[@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003 N1514, 03-0097, A Proposal to Add Mathematical Special Functions to the C++ Standard Library (version 2), Walter E. Brown]
[h4 Calculators]
We found (and used to create cross-check spot values - as far as their accuracy allowed).
[@http://functions.wolfram.com/ The Wolfram Functions Site]
The Wolfram Functions Site - Providing
the mathematical and scientific community with the world's largest
(and most authorititive) collection of formulas and graphics about mathematical functions.
[@http://www.moshier.net/cephes28.zip 100-decimal digit calculator] provided some spot values.
[@http://www.adsciengineering.com/bpdcalc/ http://www.adsciengineering.com/bpdcalc/] Binomial Probability Distribution Calculator.
[h4 Other Libraries]
[@http://www.moshier.net/#Cephes Cephes library] by Shephen Moshier and his book:
Methods and programs for mathematical functions, Stephen L B Moshier, Ellis Horwood (1989) ISBN 0745802893 0470216093 provided inspiration.
[@http://lib.stat.cmu.edu/general/cdflib CDFLIB Library of Fortran Routines for Cumulative Distribution functions.]
[@http://www.csit.fsu.edu/~burkardt/cpp_src/dcdflib/dcdflib.html DCFLIB C++ version].
[@http://www.csit.fsu.edu/~burkardt/f_src/dcdflib/dcdflib.html DCDFLIB C++ version]
DCDFLIB is a library of C++ routines, using double precision arithmetic, for evaluating cumulative probability density functions.
[@http://www.softintegration.com/docs/package/chnagstat/ http://www.softintegration.com/docs/package/chnagstat/]
[@http://www.nag.com/numeric/numerical_libraries.asp NAG] libraries.
[@http://www.mathcad.com MathCAD]
[@http://www.vni.com/products/imsl/jmsl/v30/api/com/imsl/stat/Cdf.html JMSL Numerical Library] (Java).
John F Hart, Computer Approximations, (1978) ISBN 0 088275 642-7.
William J Cody, Software Manual for the Elementary Functions, Prentice-Hall (1980) ISBN 0138220646.
Nico Temme, Special Functions, An Introduction to the Classical Functions of Mathematical Physics, Wiley, ISBN: 0471-11313-1 (1996) who also gave valueable advice.
[@http://www.cas.lancs.ac.uk/glossary_v1.1/prob.html#probdistn Statistics Glossary], Valerie Easton and John H. McColl.
[__R]
R Development Core Team (2010). R: A language and environment for
statistical computing. R Foundation for Statistical Computing,
Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
For use of R, see:
Jim Albert, Bayesian Computation with R, ISBN 978-0-387-71384-7.
[@http://www.quantnet.com/cplusplus-statistical-distributions-boost
C++ Statistical Distributions in Boost - QuantNetwork forum]
discusses using Boost.Math in finance.
[@http://www.quantnet.com/boost-and-computational-finance/ Quantnet Boost and computational finance].
Robert Demming & Daniel J. Duffy, Introduction to the C++ Boost Libraries - Volume I - Foundations
and Volume II ISBN 978-94-91028-01-4, Advanced Libraries and Applications, ISBN 978-94-91028-02-1
(to be published in 2011).
discusses application of Boost.Math, especially in finance.
[endsect] [/section:references References]
[/
Copyright 2006 John Maddock and Paul A. Bristow.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt).
]
+377
View File
@@ -0,0 +1,377 @@
[section:remez The Remez Method]
The [@http://en.wikipedia.org/wiki/Remez_algorithm Remez algorithm]
is a methodology for locating the minimax rational approximation
to a function. This short article gives a brief overview of the method, but
it should not be regarded as a thorough theoretical treatment, for that you
should consult your favorite textbook.
Imagine that you want to approximate some function f(x) by way of a rational
function R(x), where R(x) may be either a polynomial P(x) or a ratio of two
polynomials P(x)/Q(x) (a rational function). Initially we'll concentrate on the
polynomial case, as it's by far the easier to deal with, later we'll extend
to the full rational function case.
We want to find the "best" rational approximation, where
"best" is defined to be the approximation that has the least deviation
from f(x). We can measure the deviation by way of an error function:
E[sub abs](x) = f(x) - R(x)
which is expressed in terms of absolute error, but we can equally use
relative error:
E[sub rel](x) = (f(x) - R(x)) / |f(x)|
And indeed in general we can scale the error function in any way we want, it
makes no difference to the maths, although the two forms above cover almost
every practical case that you're likely to encounter.
The minimax rational function R(x) is then defined to be the function that
yields the smallest maximal value of the error function. Chebyshev showed
that there is a unique minimax solution for R(x) that has the following
properties:
* If R(x) is a polynomial of degree N, then there are N+2 unknowns:
the N+1 coefficients of the polynomial, and maximal value of the error
function.
* The error function has N+1 roots, and N+2 extrema (minima and maxima).
* The extrema alternate in sign, and all have the same magnitude.
That means that if we know the location of the extrema of the error function
then we can write N+2 simultaneous equations:
R(x[sub i]) + (-1)[super i]E = f(x[sub i])
where E is the maximal error term, and x[sub i] are the abscissa values of the
N+2 extrema of the error function. It is then trivial to solve the simultaneous
equations to obtain the polynomial coefficients and the error term.
['Unfortunately we don't know where the extrema of the error function are located!]
[h4 The Remez Method]
The Remez method is an iterative technique which, given a broad range of
assumptions, will converge on the extrema of the error function, and therefore
the minimax solution.
In the following discussion we'll use a concrete example to illustrate
the Remez method: an approximation to the function e[super x][space] over
the range \[-1, 1\].
Before we can begin the Remez method, we must obtain an initial value
for the location of the extrema of the error function. We could "guess"
these, but a much closer first approximation can be obtained by first
constructing an interpolated polynomial approximation to f(x).
In order to obtain the N+1 coefficients of the interpolated polynomial
we need N+1 points (x[sub 0]...x[sub N]): with our interpolated form
passing through each of those points
that yields N+1 simultaneous equations:
f(x[sub i]) = P(x[sub i]) = c[sub 0] + c[sub 1]x[sub i] ... + c[sub N]x[sub i][super N]
Which can be solved for the coefficients c[sub 0]...c[sub N] in P(x).
Obviously this is not a minimax solution, indeed our only guarantee is that f(x) and
P(x) touch at N+1 locations, away from those points the error may be arbitrarily
large. However, we would clearly like this initial approximation to be as close to
f(x) as possible, and it turns out that using the zeros of an orthogonal polynomial
as the initial interpolation points is a good choice. In our example we'll use the
zeros of a Chebyshev polynomial as these are particularly easy to calculate,
interpolating for a polynomial of degree 4, and measuring /relative error/
we get the following error function:
[$../graphs/remez-2.png]
Which has a peak relative error of 1.2x10[super -3].
While this is a pretty good approximation already, judging by the
shape of the error function we can clearly do better. Before starting
on the Remez method propper, we have one more step to perform: locate
all the extrema of the error function, and store
these locations as our initial ['Chebyshev control points].
[note
In the simple case of a polynomial approximation, by interpolating through
the roots of a Chebyshev polynomial we have in fact created a ['Chebyshev
approximation] to the function: in terms of /absolute error/
this is the best a priori choice for the interpolated form we can
achieve, and typically is very close to the minimax solution.
However, if we want to optimise for /relative error/, or if the approximation
is a rational function, then the initial Chebyshev solution can be quite far
from the ideal minimax solution.
A more technical discussion of the theory involved can be found in this
[@http://math.fullerton.edu/mathews/n2003/ChebyshevPolyMod.html online course].]
[h4 Remez Step 1]
The first step in the Remez method, given our current set of
N+2 Chebyshev control points x[sub i], is to solve the N+2 simultaneous
equations:
P(x[sub i]) + (-1)[super i]E = f(x[sub i])
To obtain the error term E, and the coefficients of the polynomial P(x).
This gives us a new approximation to f(x) that has the same error /E/ at
each of the control points, and whose error function ['alternates in sign]
at the control points. This is still not necessarily the minimax
solution though: since the control points may not be at the extrema of the error
function. After this first step here's what our approximation's error
function looks like:
[$../graphs/remez-3.png]
Clearly this is still not the minimax solution since the control points
are not located at the extrema, but the maximum relative error has now
dropped to 5.6x10[super -4].
[h4 Remez Step 2]
The second step is to locate the extrema of the new approximation, which we do
in two stages: first, since the error function changes sign at each
control point, we must have N+1 roots of the error function located between
each pair of N+2 control points. Once these roots are found by standard root finding
techniques, we know that N extrema are bracketed between each pair of
roots, plus two more between the endpoints of the range and the first and last roots.
The N+2 extrema can then be found using standard function minimisation techniques.
We now have a choice: multi-point exchange, or single point exchange.
In single point exchange, we move the control point nearest to the largest extrema to
the absissa value of the extrema.
In multi-point exchange we swap all the current control points, for the locations
of the extrema.
In our example we perform multi-point exchange.
[h4 Iteration]
The Remez method then performs steps 1 and 2 above iteratively until the control
points are located at the extrema of the error function: this is then
the minimax solution.
For our current example, two more iterations converges on a minimax
solution with a peak relative error of
5x10[super -4] and an error function that looks like:
[$../graphs/remez-4.png]
[h4 Rational Approximations]
If we wish to extend the Remez method to a rational approximation of the form
f(x) = R(x) = P(x) / Q(x)
where P(x) and Q(x) are polynomials, then we proceed as before, except that now
we have N+M+2 unknowns if P(x) is of order N and Q(x) is of order M. This assumes
that Q(x) is normalised so that its leading coefficient is 1, giving
N+M+1 polynomial coefficients in total, plus the error term E.
The simultaneous equations to be solved are now:
P(x[sub i]) / Q(x[sub i]) + (-1)[super i]E = f(x[sub i])
Evaluated at the N+M+2 control points x[sub i].
Unfortunately these equations are non-linear in the error term E: we can only
solve them if we know E, and yet E is one of the unknowns!
The method usually adopted to solve these equations is an iterative one: we guess the
value of E, solve the equations to obtain a new value for E (as well as the polynomial
coefficients), then use the new value of E as the next guess. The method is
repeated until E converges on a stable value.
These complications extend the running time required for the development
of rational approximations quite considerably. It is often desirable
to obtain a rational rather than polynomial approximation none the less:
rational approximations will often match more difficult to approximate
functions, to greater accuracy, and with greater efficiency, than their
polynomial alternatives. For example, if we takes our previous example
of an approximation to e[super x], we obtained 5x10[super -4] accuracy
with an order 4 polynomial. If we move two of the unknowns into the denominator
to give a pair of order 2 polynomials, and re-minimise, then the peak relative error drops
to 8.7x10[super -5]. That's a 5 fold increase in accuracy, for the same number
of terms overall.
[h4 Practical Considerations]
Most treatises on approximation theory stop at this point. However, from
a practical point of view, most of the work involves finding the right
approximating form, and then persuading the Remez method to converge
on a solution.
So far we have used a direct approximation:
f(x) = R(x)
But this will converge to a useful approximation only if f(x) is smooth. In
addition round-off errors when evaluating the rational form mean that this
will never get closer than within a few epsilon of machine precision.
Therefore this form of direct approximation is often reserved for situations
where we want efficiency, rather than accuracy.
The first step in improving the situation is generally to split f(x) into
a dominant part that we can compute accurately by another method, and a
slowly changing remainder which can be approximated by a rational approximation.
We might be tempted to write:
f(x) = g(x) + R(x)
where g(x) is the dominant part of f(x), but if f(x)\/g(x) is approximately
constant over the interval of interest then:
f(x) = g(x)(c + R(x))
Will yield a much better solution: here /c/ is a constant that is the approximate
value of f(x)\/g(x) and R(x) is typically tiny compared to /c/. In this situation
if R(x) is optimised for absolute error, then as long as its error is small compared
to the constant /c/, that error will effectively get wiped out when R(x) is added to
/c/.
The difficult part is obviously finding the right g(x) to extract from your
function: often the asymptotic behaviour of the function will give a clue, so
for example the function __erfc becomes proportional to
e[super -x[super 2]]\/x as x becomes large. Therefore using:
erfc(z) = (C + R(x)) e[super -x[super 2]]/x
as the approximating form seems like an obvious thing to try, and does indeed
yield a useful approximation.
However, the difficulty then becomes one of converging the minimax solution.
Unfortunately, it is known that for some functions the Remez method can lead
to divergent behaviour, even when the initial starting approximation is quite good.
Furthermore, it is not uncommon for the solution obtained in the first Remez step
above to be a bad one: the equations to be solved are generally "stiff", often
very close to being singular, and assuming a solution is found at all, round-off
errors and a rapidly changing error function, can lead to a situation where the
error function does not in fact change sign at each control point as required.
If this occurs, it is fatal to the Remez method. It is also possible to
obtain solutions that are perfectly valid mathematically, but which are
quite useless computationally: either because there is an unavoidable amount
of roundoff error in the computation of the rational function, or because
the denominator has one or more roots over the interval of the approximation.
In the latter case while the approximation may have the correct limiting value at
the roots, the approximation is nonetheless useless.
Assuming that the approximation does not have any fatal errors, and that the only
issue is converging adequately on the minimax solution, the aim is to
get as close as possible to the minimax solution before beginning the Remez method.
Using the zeros of a Chebyshev polynomial for the initial interpolation is a
good start, but may not be ideal when dealing with relative errors and\/or
rational (rather than polynomial) approximations. One approach is to skew
the initial interpolation points to one end: for example if we raise the
roots of the Chebyshev polynomial to a positive power greater than 1
then the roots will be skewed towards the middle of the \[-1,1\] interval,
while a positive power less than one
will skew them towards either end. More usefully, if we initially rescale the
points over \[0,1\] and then raise to a positive power, we can skew them to the left
or right. Returning to our example of e[super x][space] over \[-1,1\], the initial
interpolated form was some way from the minimax solution:
[$../graphs/remez-2.png]
However, if we first skew the interpolation points to the left (rescale them
to \[0, 1\], raise to the power 1.3, and then rescale back to \[-1,1\]) we
reduce the error from 1.3x10[super -3][space]to 6x10[super -4]:
[$../graphs/remez-5.png]
It's clearly still not ideal, but it is only a few percent away from
our desired minimax solution (5x10[super -4]).
[h4 Remez Method Checklist]
The following lists some of the things to check if the Remez method goes wrong,
it is by no means an exhaustive list, but is provided in the hopes that it will
prove useful.
* Is the function smooth enough? Can it be better separated into
a rapidly changing part, and an asymptotic part?
* Does the function being approximated have any "blips" in it? Check
for problems as the function changes computation method, or
if a root, or an infinity has been divided out. The telltale
sign is if there is a narrow region where the Remez method will
not converge.
* Check you have enough accuracy in your calculations: remember that
the Remez method works on the difference between the approximation
and the function being approximated: so you must have more digits of
precision available than the precision of the approximation
being constructed. So for example at double precision, you
shouldn't expect to be able to get better than a float precision
approximation.
* Try skewing the initial interpolated approximation to minimise the
error before you begin the Remez steps.
* If the approximation won't converge or is ill-conditioned from one starting
location, try starting from a different location.
* If a rational function won't converge, one can minimise a polynomial
(which presents no problems), then rotate one term from the numerator to
the denominator and minimise again. In theory one can continue moving
terms one at a time from numerator to denominator, and then re-minimising,
retaining the last set of control points at each stage.
* Try using a smaller interval. It may also be possible to optimise over
one (small) interval, rescale the control points over a larger interval,
and then re-minimise.
* Keep absissa values small: use a change of variable to keep the abscissa
over, say \[0, b\], for some smallish value /b/.
[h4 References]
The original references for the Remez Method and it's extension
to rational functions are unfortunately in Russian:
Remez, E.Ya., ['Fundamentals of numerical methods for Chebyshev approximations],
"Naukova Dumka", Kiev, 1969.
Remez, E.Ya., Gavrilyuk, V.T., ['Computer development of certain approaches
to the approximate construction of solutions of Chebyshev problems
nonlinearly depending on parameters], Ukr. Mat. Zh. 12 (1960), 324-338.
Gavrilyuk, V.T., ['Generalization of the first polynomial algorithm of
E.Ya.Remez for the problem of constructing rational-fractional
Chebyshev approximations], Ukr. Mat. Zh. 16 (1961), 575-585.
Some English language sources include:
Fraser, W., Hart, J.F., ['On the computation of rational approximations
to continuous functions], Comm. of the ACM 5 (1962), 401-403, 414.
Ralston, A., ['Rational Chebyshev approximation by Remes' algorithms],
Numer.Math. 7 (1965), no. 4, 322-330.
A. Ralston, ['Rational Chebyshev approximation, Mathematical
Methods for Digital Computers v. 2] (Ralston A., Wilf H., eds.),
Wiley, New York, 1967, pp. 264-284.
Hart, J.F. e.a., ['Computer approximations], Wiley, New York a.o., 1968.
Cody, W.J., Fraser, W., Hart, J.F., ['Rational Chebyshev approximation
using linear equations], Numer.Math. 12 (1968), 242-251.
Cody, W.J., ['A survey of practical rational and polynomial
approximation of functions], SIAM Review 12 (1970), no. 3, 400-423.
Barrar, R.B., Loeb, H.J., ['On the Remez algorithm for non-linear
families], Numer.Math. 15 (1970), 382-391.
Dunham, Ch.B., ['Convergence of the Fraser-Hart algorithm for rational
Chebyshev approximation], Math. Comp. 29 (1975), no. 132, 1078-1082.
G. L. Litvinov, ['Approximate construction of rational
approximations and the effect of error autocorrection],
Russian Journal of Mathematical Physics, vol.1, No. 3, 1994.
[endsect][/section:remez The Remez Method]
[/
Copyright 2006 John Maddock and Paul A. Bristow.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt).
]
+508
View File
@@ -0,0 +1,508 @@
[section:special_tut Tutorial: How to Write a New Special Function]
[section:special_tut_impl Implementation]
In this section, we'll provide a "recipe" for adding a new special function to this library to make life easier for
future authors wishing to contribute. We'll assume the function returns a single floating-point result, and takes
two floating-point arguments. For the sake of exposition we'll give the function the name [~my_special].
Normally, the implementation of such a function is split into two layers - a public user layer, and an internal
implementation layer that does the actual work.
The implementation layer is declared inside a `detail` namespace and has a simple signature:
namespace boost { namespace math { namespace detail {
template <class T, class Policy>
T my_special_imp(const T& a, const T&b, const Policy& pol)
{
/* Implementation goes here */
}
}}} // namespaces
We'll come back to what can go inside the implementation later, but first lets look at the user layer.
This consists of two overloads of the function, with and without a __Policy argument:
namespace boost{ namespace math{
template <class T, class U>
typename tools::promote_args<T, U>::type my_special(const T& a, const U& b);
template <class T, class U, class Policy>
typename tools::promote_args<T, U>::type my_special(const T& a, const U& b, const Policy& pol);
}} // namespaces
Note how each argument has a different template type - this allows for mixed type arguments - the return
type is computed from a traits class and is the "common type" of all the arguments after any integer
arguments have been promoted to type `double`.
The implementation of the non-policy overload is trivial:
namespace boost{ namespace math{
template <class T, class U>
inline typename tools::promote_args<T, U>::type my_special(const T& a, const U& b)
{
// Simply forward with a default policy:
return my_special(a, b, policies::policy<>();
}
}} // namespaces
The implementation of the other overload is somewhat more complex, as there's some meta-programming to do,
but from a runtime perspective is still a one-line forwarding function. Here it is with comments explaining
what each line does:
namespace boost{ namespace math{
template <class T, class U, class Policy>
inline typename tools::promote_args<T, U>::type my_special(const T& a, const U& b, const Policy& pol)
{
//
// We've found some standard library functions to misbehave if any FPU exception flags
// are set prior to their call, this code will clear those flags, then reset them
// on exit:
//
BOOST_FPU_EXCEPTION_GUARD
//
// The type of the result - the common type of T and U after
// any integer types have been promoted to double:
//
typedef typename tools::promote_args<T, U>::type result_type;
//
// The type used for the calculation. This may be a wider type than
// the result in order to ensure full precision:
//
typedef typename policies::evaluation<result_type, Policy>::type value_type;
//
// The type of the policy to forward to the actual implementation.
// We disable promotion of float and double as that's [possibly]
// happened already in the line above. Also reset to the default
// any policies we don't use (reduces code bloat if we're called
// multiple times with differing policies we don't actually use).
// Also normalise the type, again to reduce code bloat in case we're
// called multiple times with functionally identical policies that happen
// to be different types.
//
typedef typename policies::normalise<
Policy,
policies::promote_float<false>,
policies::promote_double<false>,
policies::discrete_quantile<>,
policies::assert_undefined<> >::type forwarding_policy;
//
// Whew. Now we can make the actual call to the implementation.
// Arguments are explicitly cast to the evaluation type, and the result
// passed through checked_narrowing_cast which handles things like overflow
// according to the policy passed:
//
return policies::checked_narrowing_cast<result_type, forwarding_policy>(
detail::my_special_imp(
static_cast<value_type>(a),
static_cast<value_type>(x),
forwarding_policy()),
"boost::math::my_special<%1%>(%1%, %1%)");
}
}} // namespaces
We're now almost there, we just need to flesh out the details of the implementation layer:
namespace boost { namespace math { namespace detail {
template <class T, class Policy>
T my_special_imp(const T& a, const T&b, const Policy& pol)
{
/* Implementation goes here */
}
}}} // namespaces
The following guidelines indicate what (other than basic arithmetic) can go in the implementation:
* Error conditions (for example bad arguments) should be handled by calling one of the
[link math_toolkit.error_handling.finding_more_information policy based error handlers].
* Calls to standard library functions should be made unqualified (this allows argument
dependent lookup to find standard library functions for user-defined floating point
types such as those from __multiprecision). In addition, the macro `BOOST_MATH_STD_USING`
should appear at the start of the function (note no semi-colon afterwards!) so that
all the math functions in `namespace std` are visible in the current scope.
* Calls to other special functions should be made as fully qualified calls, and include the
policy parameter as the last argument, for example `boost::math::tgamma(a, pol)`.
* Where possible, evaluation of series, continued fractions, polynomials, or root
finding should use one of the [link math_toolkit.internals_overview boiler-plate functions]. In any case, after
any iterative method, you should verify that the number of iterations did not exceed the
maximum specified in the __Policy type, and if it did terminate as a result of exceeding the
maximum, then the appropriate error handler should be called (see existing code for examples).
* Numeric constants such as [pi] etc should be obtained via a call to the [link math_toolkit.constants appropriate function],
for example: `constants::pi<T>()`.
* Where tables of coefficients are used (for example for rational approximations), care should be taken
to ensure these are initialized at program startup to ensure thread safety when using user-defined number types.
See for example the use of `erf_initializer` in [@../../include/boost/math/special_functions/erf.hpp erf.hpp].
Here are some other useful internal functions:
[table
[[function][Meaning]]
[[`policies::digits<T, Policy>()`][Returns number of binary digits in T (possible overridden by the policy).]]
[[`policies::get_max_series_iterations<Policy>()`][Maximum number of iterations for series evaluation.]]
[[`policies::get_max_root_iterations<Policy>()`][Maximum number of iterations for root finding.]]
[[`polices::get_epsilon<T, Policy>()`][Epsilon for type T, possibly overridden by the Policy.]]
[[`tools::digits<T>()`][Returns the number of binary digits in T.]]
[[`tools::max_value<T>()`][Equivalent to `std::numeric_limits<T>::max()`]]
[[`tools::min_value<T>()`][Equivalent to `std::numeric_limits<T>::min()`]]
[[`tools::log_max_value<T>()`][Equivalent to the natural logarithm of `std::numeric_limits<T>::max()`]]
[[`tools::log_min_value<T>()`][Equivalent to the natural logarithm of `std::numeric_limits<T>::min()`]]
[[`tools::epsilon<T>()`][Equivalent to `std::numeric_limits<T>::epsilon()`.]]
[[`tools::root_epsilon<T>()`][Equivalent to the square root of `std::numeric_limits<T>::epsilon()`.]]
[[`tools::forth_root_epsilon<T>()`][Equivalent to the forth root of `std::numeric_limits<T>::epsilon()`.]]
]
[endsect]
[section:special_tut_test Testing]
We work under the assumption that untested code doesn't work, so some tests for your new special function are in order,
we'll divide these up in to 3 main categories:
[h4 Spot Tests]
Spot tests consist of checking that the expected exception is generated when the inputs are in error (or
otherwise generate undefined values), and checking any special values. We can check for expected exceptions
with `BOOST_CHECK_THROW`, so for example if it's a domain error for the last parameter to be outside the range
`[0,1]` then we might have:
BOOST_CHECK_THROW(my_special(0, -0.1), std::domain_error);
BOOST_CHECK_THROW(my_special(0, 1.1), std::domain_error);
When the function has known exact values (typically integer values) we can use `BOOST_CHECK_EQUAL`:
BOOST_CHECK_EQUAL(my_special(1.0, 0.0), 0);
BOOST_CHECK_EQUAL(my_special(1.0, 1.0), 1);
When the function has known values which are not exact (from a floating point perspective) then we can use
`BOOST_CHECK_CLOSE_FRACTION`:
// Assumes 4 epsilon is as close as we can get to a true value of 2Pi:
BOOST_CHECK_CLOSE_FRACTION(my_special(0.5, 0.5), 2 * constants::pi<double>(), std::numeric_limits<double>::epsilon() * 4);
[h4 Independent Test Values]
If the function is implemented by some other known good source (for example Mathematica or it's online versions
[@http://functions.wolfram.com functions.wolfram.com] or [@http://www.wolframalpha.com www.wolframalpha.com]
then it's a good idea to sanity check our implementation by having at least one independendly generated value
for each code branch our implementation may take. To slot these in nicely with our testing framework it's best to
tabulate these like this:
// function values calculated on http://functions.wolfram.com/
static const boost::array<boost::array<T, 3>, 10> my_special_data = {{
{{ SC_(0), SC_(0), SC_(1) }},
{{ SC_(0), SC_(1), SC_(1.26606587775200833559824462521471753760767031135496220680814) }},
/* More values here... */
}};
We'll see how to use this table and the meaning of the `SC_` macro later. One important point
is to make sure that the input values have exact binary representations: so choose values such as
1.5, 1.25, 1.125 etc. This ensures that if `my_special` is unusually sensitive in one area, that
we don't get apparently large errors just because the inputs are 0.5 ulp in error.
[h4 Random Test Values]
We can generate a large number of test values to check both for future regressions, and for
accumulated rounding or cancellation error in our implementation. Ideally we would use an
independent implementation for this (for example my_special may be defined in directly terms
of other special functions but not implemented that way for performance or accuracy reasons).
Alternatively we may use our own implementation directly, but with any special cases (asymptotic
expansions etc) disabled. We have a set of [link math_toolkit.internals.test_data tools]
to generate test data directly, here's a typical example:
[import ../../example/special_data.cpp]
[special_data_example]
Typically several sets of data will be generated this way, including random values in some "normal"
range, extreme values (very large or very small), and values close to any "interesting" behaviour
of the function (singularities etc).
[h4 The Test File Header]
We split the actual test file into 2 distinct parts: a header that contains the testing code
as a series of function templates, and the actual .cpp test driver that decides which types
are tested, and sets the "expected" error rates for those types. It's done this way because:
* We want to test with both built in floating point types, and with multiprecision types.
However, both compile and runtimes with the latter can be too long for the folks who run
the tests to realistically cope with, so it makes sense to split the test into (at least)
2 parts.
* The definition of the SC_ macro used in our tables of data may differ depending on what type
we're testing (see below). Again this is largely a matter of managing compile times as large tables
of user-defined-types can take a crazy amount of time to compile with some compilers.
The test header contains 2 functions:
template <class Real, class T>
void do_test(const T& data, const char* type_name, const char* test_name);
template <class T>
void test(T, const char* type_name);
Before implementing those, we'll include the headers we'll need, and provide a default
definition for the SC_ macro:
// A couple of Boost.Test headers in case we need any BOOST_CHECK_* macros:
#include <boost/test/unit_test.hpp>
#include <boost/test/floating_point_comparison.hpp>
// Our function to test:
#include <boost/math/special_functions/my_special.hpp>
// We need boost::array for our test data, plus a few headers from
// libs/math/test that contain our testing machinary:
#include <boost/array.hpp>
#include "functor.hpp"
#include "handle_test_result.hpp"
#include "table_type.hpp"
#ifndef SC_
#define SC_(x) static_cast<typename table_type<T>::type>(BOOST_JOIN(x, L))
#endif
The easiest function to implement is the "test" function which is what we'll be calling
from the test-driver program. It simply includes the files containing the tabular
test data and calls `do_test` function for each table, along with a description of what's
being tested:
template <class T>
void test(T, const char* type_name)
{
//
// The actual test data is rather verbose, so it's in a separate file
//
// The contents are as follows, each row of data contains
// three items, input value a, input value b and my_special(a, b):
//
# include "my_special_1.ipp"
do_test<T>(my_special_1, name, "MySpecial Function: Mathematica Values");
# include "my_special_2.ipp"
do_test<T>(my_special_2, name, "MySpecial Function: Random Values");
# include "my_special_3.ipp"
do_test<T>(my_special_3, name, "MySpecial Function: Very Small Values");
}
The function `do_test` takes each table of data and calculates values for each row
of data, along with statistics for max and mean error etc, most of this is handled
by some boilerplate code:
template <class Real, class T>
void do_test(const T& data, const char* type_name, const char* test_name)
{
// Get the type of each row and each element in the rows:
typedef typename T::value_type row_type;
typedef Real value_type;
// Get a pointer to our function, we have to use a workaround here
// as some compilers require the template types to be explicitly
// specified, while others don't much like it if it is!
typedef value_type (*pg)(value_type, value_type);
#if defined(BOOST_MATH_NO_DEDUCED_FUNCTION_POINTERS)
pg funcp = boost::math::my_special<value_type, value_type>;
#else
pg funcp = boost::math::my_special;
#endif
// Somewhere to hold our results:
boost::math::tools::test_result<value_type> result;
// And some pretty printing:
std::cout << "Testing " << test_name << " with type " << type_name
<< "\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n";
//
// Test my_special against data:
//
result = boost::math::tools::test_hetero<Real>(
/* First argument is the table */
data,
/* Next comes our function pointer, plus the indexes of it's arguments in the table */
bind_func<Real>(funcp, 0, 1),
/* Then the index of the result in the table - potentially we can test several
related functions this way, each having the same input arguments, and different
output values in different indexes in the table */
extract_result<Real>(2));
//
// Finish off with some boilerplate to check the results were within the expected errors,
// and pretty print the results:
//
handle_test_result(result, data[result.worst()], result.worst(), type_name, "boost::math::my_special", test_name);
}
Now we just need to write the test driver program, at it's most basic it looks something like this:
#include <boost/math/special_functions/math_fwd.hpp>
#include <boost/math/tools/test.hpp>
#include <boost/math/tools/stats.hpp>
#include <boost/type_traits.hpp>
#include <boost/array.hpp>
#include "functor.hpp"
#include "handle_test_result.hpp"
#include "test_my_special.hpp"
BOOST_AUTO_TEST_CASE( test_main )
{
//
// Test each floating point type, plus real_concept.
// We specify the name of each type by hand as typeid(T).name()
// often gives an unreadable mangled name.
//
test(0.1F, "float");
test(0.1, "double");
//
// Testing of long double and real_concept is protected
// by some logic to disable these for unsupported
// or problem compilers.
//
#ifndef BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS
test(0.1L, "long double");
#ifndef BOOST_MATH_NO_REAL_CONCEPT_TESTS
#if !BOOST_WORKAROUND(__BORLANDC__, BOOST_TESTED_AT(0x582))
test(boost::math::concepts::real_concept(0.1), "real_concept");
#endif
#endif
#else
std::cout << "<note>The long double tests have been disabled on this platform "
"either because the long double overloads of the usual math functions are "
"not available at all, or because they are too inaccurate for these tests "
"to pass.</note>" << std::cout;
#endif
}
That's almost all there is too it - except that if the above program is run it's very likely that
all the tests will fail as the default maximum allowable error is 1 epsilon. So we'll
define a function (don't forget to call it from the start of the `test_main` above) to
up the limits to something sensible, based both on the function we're calling and on
the particular tests plus the platform and compiler:
void expected_results()
{
//
// Define the max and mean errors expected for
// various compilers and platforms.
//
const char* largest_type;
#ifndef BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS
if(boost::math::policies::digits<double, boost::math::policies::policy<> >() == boost::math::policies::digits<long double, boost::math::policies::policy<> >())
{
largest_type = "(long\\s+)?double|real_concept";
}
else
{
largest_type = "long double|real_concept";
}
#else
largest_type = "(long\\s+)?double";
#endif
//
// We call add_expected_result for each error rate we wish to adjust, these tell
// handle_test_result what level of error is acceptable. We can have as many calls
// to add_expected_result as we need, each one establishes a rule for acceptable error
// with rules set first given preference.
//
add_expected_result(
/* First argument is a regular expression to match against the name of the compiler
set in BOOST_COMPILER */
".*",
/* Second argument is a regular expression to match against the name of the
C++ standard library as set in BOOST_STDLIB */
".*",
/* Third argument is a regular expression to match against the name of the
platform as set in BOOST_PLATFORM */
".*",
/* Forth argument is the name of the type being tested, normally we will
only need to up the acceptable error rate for the widest floating
point type being tested */
largest_real,
/* Fifth argument is a regular expression to match against
the name of the group of data being tested */
"MySpecial Function:.*Small.*",
/* Sixth argument is a regular expression to match against the name
of the function being tested */
"boost::math::my_special",
/* Seventh argument is the maximum allowable error expressed in units
of machine epsilon passed as a long integer value */
50,
/* Eighth argument is the maximum allowable mean error expressed in units
of machine epsilon passed as a long integer value */
20);
}
[h4 Testing Multiprecision Types]
Testing of multiprecision types is handled by the test drivers in libs/multiprecision/test/math,
please refer to these for examples. Note that these tests are run only occationally as they take
a lot of CPU cycles to build and run.
[h4 Improving Compile Times]
As noted above, these test programs can take a while to build as we're instantiating a lot of templates
for several different types, and our test runners are already stretched to the limit, and probably
using outdated "spare" hardware. There are two things we can do to speed things up:
* Use a precompiled header.
* Use separate compilation of our special function templates.
We can make these changes by changing the list of includes from:
#include <boost/math/special_functions/math_fwd.hpp>
#include <boost/math/tools/test.hpp>
#include <boost/math/tools/stats.hpp>
#include <boost/type_traits.hpp>
#include <boost/array.hpp>
#include "functor.hpp"
#include "handle_test_result.hpp"
To just:
#include <pch_light.hpp>
And changing
#include <boost/math/special_functions/my_special.hpp>
To:
#include <boost/math/special_functions/math_fwd.hpp>
The Jamfile target that builds the test program will need the targets
test_instances//test_instances pch_light
adding to it's list of source dependencies (see the Jamfile for examples).
Finally the project in libs/math/test/test_instances will need modifying
to instantiate function `my_special`.
These changes should be made last, when `my_special` is stable and the code is in Trunk.
[h4 Concept Checks]
Our concept checks verify that your function's implementation makes no assumptions that aren't
required by our [link math_toolkit.real_concepts Real number conceptual requirements]. They also
check for various common bugs and programming traps that we've fallen into over time. To
add your function to these tests, edit libs/math/test/compile_test/instantiate.hpp to add
calls to your function: there are 7 calls to each function, each with a different purpose.
Search for something like "ibeta" or "gamm_p" and follow their examples.
[endsect]
[endsect]
[/
Copyright 2013 John Maddock.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt).
]