mirror of
				https://github.com/saitohirga/WSJT-X.git
				synced 2025-11-03 21:40:52 -05:00 
			
		
		
		
	
		
			
				
	
	
		
			665 lines
		
	
	
		
			28 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			665 lines
		
	
	
		
			28 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
[section:sf_implementation Additional Implementation Notes]
 | 
						|
 | 
						|
The majority of the implementation notes are included with the documentation
 | 
						|
of each function or distribution.  The notes here are of a more general nature,
 | 
						|
and reflect more the general implementation philosophy used.
 | 
						|
 | 
						|
[h4 Implementation philosophy]
 | 
						|
 | 
						|
"First be right, then be fast."
 | 
						|
 | 
						|
There will always be potential compromises
 | 
						|
to be made between speed and accuracy.
 | 
						|
It may be possible to find faster methods,
 | 
						|
particularly for certain limited ranges of arguments,
 | 
						|
but for most applications of math functions and distributions,
 | 
						|
we judge that speed is rarely as important as accuracy.
 | 
						|
 | 
						|
So our priority is accuracy.
 | 
						|
 | 
						|
To permit evaluation of accuracy of the special functions,
 | 
						|
production of extremely accurate tables of test values
 | 
						|
has received considerable effort.
 | 
						|
 | 
						|
(It also required much CPU effort -
 | 
						|
there was some danger of molten plastic dripping from the bottom of JM's laptop,
 | 
						|
so instead, PAB's Dual-core desktop was kept 50% busy for [*days]
 | 
						|
calculating some tables of test values!)
 | 
						|
 | 
						|
For a specific RealType, say `float` or `double`,
 | 
						|
it may be possible to find approximations for some functions
 | 
						|
that are simpler and thus faster, but less accurate
 | 
						|
(perhaps because there are no refining iterations,
 | 
						|
for example, when calculating inverse functions).
 | 
						|
 | 
						|
If these prove accurate enough to be "fit for his purpose",
 | 
						|
then a user may substitute his custom specialization.
 | 
						|
 | 
						|
For example, there are approximations dating back from times
 | 
						|
when computation was a [*lot] more expensive:
 | 
						|
 | 
						|
H Goldberg and H Levine, Approximate formulas for
 | 
						|
percentage points and normalisation of t and chi squared,
 | 
						|
Ann. Math. Stat., 17(4), 216 - 225 (Dec 1946).
 | 
						|
 | 
						|
A H Carter, Approximations to percentage points of the z-distribution,
 | 
						|
Biometrika 34(2), 352 - 358 (Dec 1947).
 | 
						|
 | 
						|
These could still provide sufficient accuracy for some speed-critical applications.
 | 
						|
 | 
						|
[h4 Accuracy and Representation of Test Values]
 | 
						|
 | 
						|
In order to be accurate enough for as many as possible real types,
 | 
						|
constant values are given to 50 decimal digits if available
 | 
						|
(though many sources proved only accurate near to 64-bit double precision).
 | 
						|
Values are specified as long double types by appending L,
 | 
						|
unless they are exactly representable, for example integers, or binary fractions like 0.125.
 | 
						|
This avoids the risk of loss of accuracy converting from double, the default type.
 | 
						|
Values are used after `static_cast<RealType>(1.2345L)`
 | 
						|
to provide the appropriate RealType for spot tests.
 | 
						|
 | 
						|
Functions that return constants values, like kurtosis for example, are written as
 | 
						|
 | 
						|
`static_cast<RealType>(-3) / 5;`
 | 
						|
 | 
						|
to provide the most accurate value
 | 
						|
that the compiler can compute for the real type.
 | 
						|
(The denominator is an integer and so will be promoted exactly).
 | 
						|
 | 
						|
So tests for one third, *not* exactly representable with radix two floating-point,
 | 
						|
(should) use, for example:
 | 
						|
 | 
						|
`static_cast<RealType>(1) / 3;`
 | 
						|
 | 
						|
If a function is very sensitive to changes in input,
 | 
						|
specifying an inexact value as input (such as 0.1) can throw
 | 
						|
the result off by a noticeable amount: 0.1f is "wrong"
 | 
						|
by ~1e-7 for example (because 0.1 has no exact binary representation).
 | 
						|
That is why exact binary values - halves, quarters, and eighths etc -
 | 
						|
are used in test code along with the occasional fraction `a/b` with `b`
 | 
						|
a power of two (in order to ensure that the result is an exactly
 | 
						|
representable binary value).
 | 
						|
 | 
						|
[h4 Tolerance of Tests]
 | 
						|
 | 
						|
The tolerances need to be set to the maximum of:
 | 
						|
 | 
						|
* Some epsilon value.
 | 
						|
* The accuracy of the data (often only near 64-bit double).
 | 
						|
 | 
						|
Otherwise when long double has more digits than the test data, then no
 | 
						|
amount of tweaking an epsilon based tolerance will work.
 | 
						|
 | 
						|
A common problem is when tolerances that are suitable for implementations
 | 
						|
like Microsoft VS.NET where double and long double are the same size:
 | 
						|
tests fail on other systems where long double is more accurate than double.
 | 
						|
Check first that the suffix L is present, and then that the tolerance is big enough.
 | 
						|
 | 
						|
[h4 Handling Unsuitable Arguments]
 | 
						|
 | 
						|
In
 | 
						|
[@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1665.pdf Errors in Mathematical Special Functions], J. Marraffino & M. Paterno
 | 
						|
it is proposed that signalling a domain error is mandatory
 | 
						|
when the argument would give an mathematically undefined result.
 | 
						|
 | 
						|
*Guideline 1
 | 
						|
 | 
						|
[:A mathematical function is said to be defined at a point a = (a1, a2, . . .)
 | 
						|
if the limits as x = (x1, x2, . . .) 'approaches a from all directions agree'.
 | 
						|
The defined value may be any number, or +infinity, or -infinity.]
 | 
						|
 | 
						|
Put crudely, if the function goes to + infinity
 | 
						|
and then emerges 'round-the-back' with - infinity,
 | 
						|
it is NOT defined.
 | 
						|
 | 
						|
[:The library function which approximates a mathematical function shall signal a domain error
 | 
						|
whenever evaluated with argument values for which the mathematical function is undefined.]
 | 
						|
 | 
						|
*Guideline 2
 | 
						|
 | 
						|
[:The library function which approximates a mathematical function
 | 
						|
shall signal a domain error whenever evaluated with argument values
 | 
						|
for which the mathematical function obtains a non-real value.]
 | 
						|
 | 
						|
This implementation is believed to follow these proposals and to assist compatibility with
 | 
						|
['ISO/IEC 9899:1999 Programming languages - C]
 | 
						|
and with the
 | 
						|
[@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1836.pdf Draft Technical Report on C++ Library Extensions, 2005-06-24, section 5.2.1, paragraph 5].
 | 
						|
[link math_toolkit.error_handling See also domain_error].
 | 
						|
 | 
						|
See __policy_ref for details of the error handling policies that should allow
 | 
						|
a user to comply with any of these recommendations, as well as other behaviour.
 | 
						|
 | 
						|
See [link math_toolkit.error_handling error handling]
 | 
						|
for a detailed explanation of the mechanism, and
 | 
						|
[link math_toolkit.stat_tut.weg.error_eg error_handling example]
 | 
						|
and
 | 
						|
[@../../example/error_handling_example.cpp error_handling_example.cpp]
 | 
						|
 | 
						|
[caution If you enable throw but do NOT have try & catch block,
 | 
						|
then the program will terminate with an uncaught exception and probably abort.
 | 
						|
Therefore to get the benefit of helpful error messages, enabling *all* exceptions
 | 
						|
*and* using try&catch is recommended for all applications.
 | 
						|
However, for simplicity, this is not done for most examples.]
 | 
						|
 | 
						|
[h4 Handling of Functions that are Not Mathematically defined]
 | 
						|
 | 
						|
Functions that are not mathematically defined,
 | 
						|
like the Cauchy mean, fail to compile by default.
 | 
						|
A [link math_toolkit.pol_ref.assert_undefined policy]
 | 
						|
allows control of this.
 | 
						|
 | 
						|
If the policy is to permit undefined functions, then calling them
 | 
						|
throws a domain error, by default.  But the error policy can be set
 | 
						|
to not throw, and to return NaN instead.  For example,
 | 
						|
 | 
						|
`#define BOOST_MATH_DOMAIN_ERROR_POLICY ignore_error`
 | 
						|
 | 
						|
appears before the first Boost include,
 | 
						|
then if the un-implemented function is called,
 | 
						|
mean(cauchy<>()) will return std::numeric_limits<T>::quiet_NaN().
 | 
						|
 | 
						|
[warning If `std::numeric_limits<T>::has_quiet_NaN` is false
 | 
						|
(for example, if T is a User-defined type without NaN support),
 | 
						|
then an exception will always be thrown when a domain error occurs.
 | 
						|
Catching exceptions is therefore strongly recommended.]
 | 
						|
 | 
						|
[h4 Median of distributions]
 | 
						|
 | 
						|
There are many distributions for which we have been unable to find an analytic formula,
 | 
						|
and this has deterred us from implementing
 | 
						|
[@http://en.wikipedia.org/wiki/Median median functions], the mid-point in a list of values.
 | 
						|
 | 
						|
However a useful numerical approximation for distribution `dist`
 | 
						|
is available as usual as an accessor non-member function median using `median(dist)`,
 | 
						|
that may be evaluated (in the absence of an analytic formula) by calling
 | 
						|
 | 
						|
`quantile(dist, 0.5)` (this is the /mathematical/ definition of course).
 | 
						|
 | 
						|
[@http://www.amstat.org/publications/jse/v13n2/vonhippel.html Mean, Median, and Skew, Paul T von Hippel]
 | 
						|
 | 
						|
[@http://documents.wolfram.co.jp/teachersedition/MathematicaBook/24.5.html Descriptive Statistics,]
 | 
						|
 | 
						|
[@http://documents.wolfram.co.jp/v5/Add-onsLinks/StandardPackages/Statistics/DescriptiveStatistics.html and ]
 | 
						|
 | 
						|
[@http://documents.wolfram.com/v5/TheMathematicaBook/AdvancedMathematicsInMathematica/NumericalOperationsOnData/3.8.1.html
 | 
						|
Mathematica Basic Statistics.] give more detail, in particular for discrete distributions.
 | 
						|
 | 
						|
 | 
						|
[h4 Handling of Floating-Point Infinity]
 | 
						|
 | 
						|
Some functions and distributions are well defined with + or - infinity as
 | 
						|
argument(s), but after some experiments with handling infinite arguments
 | 
						|
as special cases, we concluded that it was generally more useful to forbid this,
 | 
						|
and instead to return the result of __domain_error.
 | 
						|
 | 
						|
Handling infinity as special cases is additionally complicated
 | 
						|
because, unlike built-in types on most - but not all - platforms,
 | 
						|
not all User-Defined Types are
 | 
						|
specialized to provide `std::numeric_limits<RealType>::infinity()`
 | 
						|
and would return zero rather than any representation of infinity.
 | 
						|
 | 
						|
The rationale is that non-finiteness may happen because of error
 | 
						|
or overflow in the users code, and it will be more helpful for this
 | 
						|
to be diagnosed promptly rather than just continuing.
 | 
						|
The code also became much more complicated, more error-prone,
 | 
						|
much more work to test, and much less readable.
 | 
						|
 | 
						|
However in a few cases, for example normal, where we felt it obvious,
 | 
						|
we have permitted argument(s) to be infinity,
 | 
						|
provided infinity is implemented for the `RealType` on that implementation,
 | 
						|
and it is supported and tested by the distribution.
 | 
						|
 | 
						|
The range for these distributions is set to infinity if supported by the platform,
 | 
						|
(by testing `std::numeric_limits<RealType>::has_infinity`)
 | 
						|
else the maximum value provided for the `RealType` by Boost.Math.
 | 
						|
 | 
						|
Testing for has_infinity is obviously important for arbitrary precision types
 | 
						|
where infinity makes much less sense than for IEEE754 floating-point.
 | 
						|
 | 
						|
So far we have not set `support()` function (only range)
 | 
						|
on the grounds that the PDF is uninteresting/zero for infinities.
 | 
						|
 | 
						|
Users who require special handling of infinity (or other specific value) can,
 | 
						|
of course, always intercept this before calling a distribution or function
 | 
						|
and return their own choice of value, or other behavior.
 | 
						|
This will often be simpler than trying to handle the aftermath of the error policy.
 | 
						|
 | 
						|
Overflow, underflow, denorm can be handled using __error_policy.
 | 
						|
 | 
						|
We have also tried to catch boundary cases where the mathematical specification
 | 
						|
would result in divide by zero or overflow and signalling these similarly.
 | 
						|
What happens at (and near), poles can be controlled through __error_policy.
 | 
						|
 | 
						|
[h4 Scale, Shape and Location]
 | 
						|
 | 
						|
We considered adding location and scale to the list of functions, for example:
 | 
						|
 | 
						|
  template <class RealType>
 | 
						|
  inline RealType scale(const triangular_distribution<RealType>& dist)
 | 
						|
  {
 | 
						|
    RealType lower = dist.lower();
 | 
						|
    RealType mode = dist.mode();
 | 
						|
    RealType upper = dist.upper();
 | 
						|
    RealType result;  // of checks.
 | 
						|
    if(false == detail::check_triangular(BOOST_CURRENT_FUNCTION, lower, mode, upper, &result))
 | 
						|
    {
 | 
						|
      return result;
 | 
						|
    }
 | 
						|
    return (upper - lower);
 | 
						|
  }
 | 
						|
 | 
						|
but found that these concepts are not defined (or their definition too contentious)
 | 
						|
for too many distributions to be generally applicable.
 | 
						|
Because they are non-member functions, they can be added if required.
 | 
						|
 | 
						|
[h4 Notes on Implementation of Specific Functions & Distributions]
 | 
						|
 | 
						|
* Default parameters for the Triangular Distribution.
 | 
						|
We are uncertain about the best default parameters.
 | 
						|
Some sources suggest that the Standard Triangular Distribution has
 | 
						|
lower = 0, mode = half and upper = 1.
 | 
						|
However as a approximation for the normal distribution,
 | 
						|
the most common usage, lower = -1, mode = 0 and upper = 1 would be more suitable.
 | 
						|
 | 
						|
[h4 Rational Approximations Used]
 | 
						|
 | 
						|
Some of the special functions in this library are implemented via
 | 
						|
rational approximations.  These are either taken from the literature,
 | 
						|
or devised by John Maddock using
 | 
						|
[link math_toolkit.internals.minimax our Remez code].
 | 
						|
 | 
						|
Rational rather than Polynomial approximations are used to ensure
 | 
						|
accuracy: polynomial approximations are often wonderful up to
 | 
						|
a certain level of accuracy, but then quite often fail to provide much greater
 | 
						|
accuracy no matter how many more terms are added.
 | 
						|
 | 
						|
Our own approximations were devised either for added accuracy
 | 
						|
(to support 128-bit long doubles for example), or because
 | 
						|
literature methods were unavailable or under non-BSL
 | 
						|
compatible license.  Our Remez code is known to produce good
 | 
						|
agreement with literature results in fairly simple "toy" cases.
 | 
						|
All approximations were checked
 | 
						|
for convergence and to ensure that
 | 
						|
they were not ill-conditioned (the coefficients can give a
 | 
						|
theoretically good solution, but the resulting rational function
 | 
						|
may be un-computable at fixed precision).
 | 
						|
 | 
						|
Recomputing using different
 | 
						|
Remez implementations may well produce differing coefficients: the
 | 
						|
problem is well known to be ill conditioned in general, and our Remez implementation
 | 
						|
often found a broad and ill-defined minima for many of these approximations
 | 
						|
(of course for simple "toy" examples like approximating `exp` the minima
 | 
						|
is well defined, and the coefficients should agree no matter whose Remez
 | 
						|
implementation is used).  This should not in general effect the validity
 | 
						|
of the approximations: there's good literature supporting the idea that
 | 
						|
coefficients can be "in error" without necessarily adversely effecting
 | 
						|
the result.  Note that "in error" has a special meaning in this context,
 | 
						|
see [@http://front.math.ucdavis.edu/0101.5042
 | 
						|
"Approximate construction of rational approximations and the effect
 | 
						|
of error autocorrection.", Grigori Litvinov, eprint arXiv:math/0101042].
 | 
						|
Therefore the coefficients still need to be accurately calculated, even if they can
 | 
						|
be in error compared to the "true" minimax solution.
 | 
						|
 | 
						|
[h4 Representation of Mathematical Constants]
 | 
						|
 | 
						|
A macro BOOST_DEFINE_MATH_CONSTANT in constants.hpp is used
 | 
						|
to provide high accuracy constants to mathematical functions and distributions,
 | 
						|
since it is important to provide values uniformly for both built-in
 | 
						|
float, double and long double types,
 | 
						|
and for User Defined types in __multiprecision like __cpp_dec_float.
 | 
						|
and others like NTL::quad_float and NTL::RR.
 | 
						|
 | 
						|
To permit calculations in this Math ToolKit and its tests, (and elsewhere)
 | 
						|
at about 100 decimal digits with NTL::RR type,
 | 
						|
it is obviously necessary to define constants to this accuracy.
 | 
						|
 | 
						|
However, some compilers do not accept decimal digits strings as long as this.
 | 
						|
So the constant is split into two parts, with the 1st containing at least
 | 
						|
long double precision, and the 2nd zero if not needed or known.
 | 
						|
The 3rd part permits an exponent to be provided if necessary (use zero if none) -
 | 
						|
the other two parameters may only contain decimal digits (and sign and decimal point),
 | 
						|
and may NOT include an exponent like 1.234E99 (nor a trailing F or L).
 | 
						|
The second digit string is only used if T is a User-Defined Type,
 | 
						|
when the constant is converted to a long string literal and lexical_casted to type T.
 | 
						|
(This is necessary because you can't use a numeric constant
 | 
						|
since even a long double might not have enough digits).
 | 
						|
 | 
						|
For example, pi is defined:
 | 
						|
 | 
						|
  BOOST_DEFINE_MATH_CONSTANT(pi,
 | 
						|
    3.141592653589793238462643383279502884197169399375105820974944,
 | 
						|
    5923078164062862089986280348253421170679821480865132823066470938446095505,
 | 
						|
    0)
 | 
						|
 | 
						|
And used thus:
 | 
						|
 | 
						|
  using namespace boost::math::constants;
 | 
						|
 | 
						|
  double diameter = 1.;
 | 
						|
  double radius = diameter * pi<double>();
 | 
						|
 | 
						|
  or boost::math::constants::pi<NTL::RR>()
 | 
						|
 | 
						|
Note that it is necessary (if inconvenient) to specify the type explicitly.
 | 
						|
 | 
						|
So you cannot write
 | 
						|
 | 
						|
  double p = boost::math::constants::pi<>();  // could not deduce template argument for 'T'
 | 
						|
 | 
						|
Neither can you write:
 | 
						|
 | 
						|
  double p = boost::math::constants::pi; // Context does not allow for disambiguation of overloaded function
 | 
						|
  double p = boost::math::constants::pi(); // Context does not allow for disambiguation of overloaded function
 | 
						|
 | 
						|
[h4 Thread safety]
 | 
						|
 | 
						|
Reporting of error by setting `errno` should be thread-safe already
 | 
						|
(otherwise none of the std lib math functions would be thread safe?).
 | 
						|
If you turn on reporting of errors via exceptions, `errno` gets left unused anyway.
 | 
						|
 | 
						|
For normal C++ usage, the Boost.Math `static const` constants are now thread-safe so
 | 
						|
for built-in real-number types: `float`, `double` and `long double` are all thread safe.
 | 
						|
 | 
						|
For User_defined types, for example, __cpp_dec_float,
 | 
						|
the Boost.Math should also be thread-safe,
 | 
						|
(thought we are unsure how to rigorously prove this).
 | 
						|
 | 
						|
(Thread safety has received attention in the C++11 Standard revision,
 | 
						|
so hopefully all compilers will do the right thing here at some point.)
 | 
						|
 | 
						|
[h4 Sources of Test Data]
 | 
						|
 | 
						|
We found a large number of sources of test data.
 | 
						|
We have assumed that these are /"known good"/
 | 
						|
if they agree with the results from our test
 | 
						|
and only consulted other sources for their /'vote'/
 | 
						|
in the case of serious disagreement.
 | 
						|
The accuracy, actual and claimed, vary very widely.
 | 
						|
Only [@http://functions.wolfram.com/ Wolfram Mathematica functions]
 | 
						|
provided a higher accuracy than
 | 
						|
C++ double (64-bit floating-point) and was regarded as
 | 
						|
the most-trusted source by far.
 | 
						|
The __R provided the widest range of distributions,
 | 
						|
but the usual Intel X86 distribution uses 64-but doubles,
 | 
						|
so our use was limited to the 15 to 17 decimal digit accuracy.
 | 
						|
 | 
						|
A useful index of sources is:
 | 
						|
[@http://www.sal.hut.fi/Teaching/Resources/ProbStat/table.html
 | 
						|
Web-oriented Teaching Resources in Probability and Statistics]
 | 
						|
 | 
						|
[@http://espse.ed.psu.edu/edpsych/faculty/rhale/hale/507Mat/statlets/free/pdist.htm Statlet]:
 | 
						|
Is a Javascript application that calculates and plots probability distributions,
 | 
						|
and provides the most complete range of distributions:
 | 
						|
 | 
						|
[:Bernoulli, Binomial, discrete uniform, geometric, hypergeometric,
 | 
						|
negative binomial, Poisson, beta, Cauchy-Lorentz, chi-sequared, Erlang,
 | 
						|
exponential, extreme value, Fisher, gamma, Laplace, logistic,
 | 
						|
lognormal, normal, Parteo, Student's t, triangular, uniform, and Weibull.]
 | 
						|
 | 
						|
It calculates pdf, cdf, survivor, log survivor, hazard, tail areas,
 | 
						|
& critical values for 5 tail values.
 | 
						|
 | 
						|
It is also the only independent source found for the Weibull distribution;
 | 
						|
unfortunately it appears to suffer from very poor accuracy in areas where
 | 
						|
the underlying special function is known to be difficult to implement.
 | 
						|
 | 
						|
[h4 Testing for Invalid Parameters to Functions and Constructors]
 | 
						|
 | 
						|
After finding that some 'bad' parameters (like NaN) were not throwing
 | 
						|
a `domain_error` exception as they should, a function
 | 
						|
 | 
						|
`check_out_of_range` (in `test_out_of_range.hpp`)
 | 
						|
was devised by JM to check
 | 
						|
(using Boost.Test's BOOST_CHECK_THROW macro)
 | 
						|
that bad parameters passed to constructors and functions throw `domain_error` exceptions.
 | 
						|
 | 
						|
Usage is `check_out_of_range< DistributionType >(list-of-params);`
 | 
						|
Where list-of-params is a list of *valid* parameters from which the distribution can be constructed
 | 
						|
- ie the same number of args are passed to the function,
 | 
						|
as are passed to the distribution constructor.
 | 
						|
 | 
						|
The values of the parameters are not important, but must be *valid* to pass the constructor checks;
 | 
						|
the default values are suitable, but must be explicitly provided, for example:
 | 
						|
 | 
						|
   check_out_of_range<extreme_value_distribution<RealType> >(1, 2);
 | 
						|
 | 
						|
Checks made are:
 | 
						|
 | 
						|
* Infinity or NaN (if available) passed in place of each of the valid params.
 | 
						|
* Infinity or NaN (if available) as a random variable.
 | 
						|
* Out-of-range random variable passed to pdf and cdf
 | 
						|
(ie outside of "range(DistributionType)").
 | 
						|
* Out-of-range probability passed to quantile function and complement.
 | 
						|
 | 
						|
but does *not* check finite but out-of-range parameters to the constructor
 | 
						|
because these are specific to each distribution, for example:
 | 
						|
 | 
						|
    BOOST_CHECK_THROW(pdf(pareto_distribution<RealType>(0, 1), 0), std::domain_error);
 | 
						|
    BOOST_CHECK_THROW(pdf(pareto_distribution<RealType>(1, 0), 0), std::domain_error);
 | 
						|
 | 
						|
checks `scale` and `shape` parameters are both > 0
 | 
						|
by checking that `domain_error` exception is thrown if either are == 0.
 | 
						|
 | 
						|
(Use of `check_out_of_range` function may mean that some previous tests are now redundant).
 | 
						|
 | 
						|
It was also noted that if more than one parameter is bad,
 | 
						|
then only the first detected will be reported by the error message.
 | 
						|
 | 
						|
[h4 Creating and Managing the Equations]
 | 
						|
 | 
						|
Equations that fit on a single line can most easily be produced by inline Quickbook code
 | 
						|
using templates for Unicode Greek and Unicode Math symbols.
 | 
						|
All Greek letter and small set of Math symbols is available at
 | 
						|
/boost-path/libs/math/doc/sf_and_dist/html4_symbols.qbk
 | 
						|
 | 
						|
Where equations need to use more than one line, real Math editors were used.
 | 
						|
 | 
						|
The primary source for the equations is now
 | 
						|
[@http://www.w3.org/Math/ MathML]: see the
 | 
						|
*.mml files in libs\/math\/doc\/sf_and_dist\/equations\/.
 | 
						|
 | 
						|
These are most easily edited by a GUI editor such as
 | 
						|
[@http://mathcast.sourceforge.net/home.html Mathcast],
 | 
						|
please note that the equation editor supplied with Open Office
 | 
						|
currently mangles these files and should not currently be used.
 | 
						|
 | 
						|
Conversion to SVG was achieved using
 | 
						|
[@https://sourceforge.net/projects/svgmath/ SVGMath] and a command line
 | 
						|
such as:
 | 
						|
 | 
						|
[pre
 | 
						|
$for file in *.mml; do
 | 
						|
>/cygdrive/c/Python25/python.exe 'C:\download\open\SVGMath-0.3.1\math2svg.py' \\
 | 
						|
>>$file > $(basename $file .mml).svg
 | 
						|
>done
 | 
						|
]
 | 
						|
 | 
						|
See also the section on "Using Python to run Inkscape" and
 | 
						|
"Using inkscape to convert scalable vector SVG files to Portable Network graphic PNG".
 | 
						|
 | 
						|
Note that SVGMath requires that the mml files are *not* wrapped in an XHTML
 | 
						|
XML wrapper - this is added by Mathcast by default - one workaround is to
 | 
						|
copy an existing mml file and then edit it with Mathcast: the existing
 | 
						|
format should then be preserved.  This is a bug in the XML parser used by
 | 
						|
SVGMath which the author is aware of.
 | 
						|
 | 
						|
If necessary the XHTML wrapper can be removed with:
 | 
						|
 | 
						|
[pre cat filename | tr -d "\\r\\n" \| sed -e 's\/.*\\(<math\[^>\]\*>.\*<\/math>\\).\*\/\\1\/' > newfile]
 | 
						|
 | 
						|
Setting up fonts for SVGMath is currently rather tricky, on a Windows XP system
 | 
						|
JM's font setup is the same as the sample config file provided with SVGMath
 | 
						|
but with:
 | 
						|
 | 
						|
[pre
 | 
						|
    <!\-\- Double\-struck \-\->
 | 
						|
    <mathvariant name\="double\-struck" family\="Mathematica7, Lucida Sans Unicode"\/>
 | 
						|
]
 | 
						|
 | 
						|
changed to:
 | 
						|
 | 
						|
[pre
 | 
						|
    <!\-\- Double\-struck \-\->
 | 
						|
    <mathvariant name\="double\-struck" family\="Lucida Sans Unicode"\/>
 | 
						|
]
 | 
						|
 | 
						|
Note that unlike the sample config file supplied with SVGMath, this does not
 | 
						|
make use of the [@http://support.wolfram.com/technotes/fonts/windows/latestfonts.html Mathematica 7 font]
 | 
						|
as this lacks sufficient Unicode information
 | 
						|
for it to be used with either SVGMath or XEP "as is".
 | 
						|
 | 
						|
Also note that the SVG files in the repository are almost certainly
 | 
						|
Windows-specific since they reference various Windows Fonts.
 | 
						|
 | 
						|
PNG files can be created from the SVGs using
 | 
						|
[@http://xmlgraphics.apache.org/batik/tools/rasterizer.html Batik]
 | 
						|
and a command such as:
 | 
						|
 | 
						|
[pre java -jar 'C:\download\open\batik-1.7\batik-rasterizer.jar' -dpi 120 *.svg]
 | 
						|
 | 
						|
Or using Inkscape (File, Export bitmap, Drawing tab, bitmap size (default size, 100 dpi), Filename (default). png)
 | 
						|
 | 
						|
or Using Cygwin, a command such as:
 | 
						|
 | 
						|
[pre for file in *.svg; do
 | 
						|
  /cygdrive/c/progra~1/Inkscape/inkscape -d 120 -e $(cygpath -a -w $(basename $file .svg).png) $(cygpath -a -w $file);
 | 
						|
done]
 | 
						|
 | 
						|
Using BASH
 | 
						|
 | 
						|
[pre # Convert single SVG to PNG file.
 | 
						|
# /c/progra~1/Inkscape/inkscape -d 120 -e a.png a.svg
 | 
						|
]
 | 
						|
 | 
						|
or to convert All files in folder SVG to PNG.
 | 
						|
 | 
						|
[pre
 | 
						|
for file in *.svg; do
 | 
						|
/c/progra~1/Inkscape/inkscape -d 120 -e $(basename $file .svg).png $file
 | 
						|
done
 | 
						|
]
 | 
						|
 | 
						|
Currently Inkscape seems to generate the better looking PNGs.
 | 
						|
 | 
						|
The PDF is generated into \pdf\math.pdf
 | 
						|
using a command from a shell or command window with current directory
 | 
						|
\math_toolkit\libs\math\doc\sf_and_dist, typically:
 | 
						|
 | 
						|
[pre bjam -a pdf >math_pdf.log]
 | 
						|
 | 
						|
Note that XEP will have to be configured to *use and embed*
 | 
						|
whatever fonts are used by the SVG equations
 | 
						|
(almost certainly editing the sample xep.xml provided by the XEP installation).
 | 
						|
If you fail to do this you will get XEP warnings in the log file like
 | 
						|
 | 
						|
[pre \[warning\]could not find any font family matching "Times New Roman"; replaced by Helvetica]
 | 
						|
 | 
						|
(html is the default so it is generated at libs\math\doc\html\index.html
 | 
						|
using command line >bjam -a > math_toolkit.docs.log).
 | 
						|
 | 
						|
 <!-- Sample configuration for Windows TrueType fonts.  -->
 | 
						|
is provided in the xep.xml downloaded, but the Windows TrueType fonts are commented out.
 | 
						|
 | 
						|
JM's XEP config file \xep\xep.xml has the following font configuration section added:
 | 
						|
 | 
						|
[pre
 | 
						|
    <font\-group xml:base\="file:\/C:\/Windows\/Fonts\/" label\="Windows TrueType" embed\="true" subset\="true">
 | 
						|
      <font\-family name\="Arial">
 | 
						|
        <font><font\-data ttf\="arial.ttf"\/><\/font>
 | 
						|
        <font style\="oblique"><font\-data ttf\="ariali.ttf"\/><\/font>
 | 
						|
        <font weight\="bold"><font\-data ttf\="arialbd.ttf"\/><\/font>
 | 
						|
        <font weight\="bold" style\="oblique"><font\-data ttf\="arialbi.ttf"\/><\/font>
 | 
						|
      <\/font\-family>
 | 
						|
 | 
						|
      <font\-family name\="Times New Roman" ligatures\="fi fl">
 | 
						|
        <font><font\-data ttf\="times.ttf"\/><\/font>
 | 
						|
        <font style\="italic"><font\-data ttf\="timesi.ttf"\/><\/font>
 | 
						|
        <font weight\="bold"><font\-data ttf\="timesbd.ttf"\/><\/font>
 | 
						|
        <font weight\="bold" style\="italic"><font\-data ttf\="timesbi.ttf"\/><\/font>
 | 
						|
      <\/font\-family>
 | 
						|
 | 
						|
      <font\-family name\="Courier New">
 | 
						|
        <font><font\-data ttf\="cour.ttf"\/><\/font>
 | 
						|
        <font style\="oblique"><font\-data ttf\="couri.ttf"\/><\/font>
 | 
						|
        <font weight\="bold"><font\-data ttf\="courbd.ttf"\/><\/font>
 | 
						|
        <font weight\="bold" style\="oblique"><font\-data ttf\="courbi.ttf"\/><\/font>
 | 
						|
      <\/font\-family>
 | 
						|
 | 
						|
      <font\-family name\="Tahoma" embed\="true">
 | 
						|
        <font><font\-data ttf\="tahoma.ttf"\/><\/font>
 | 
						|
        <font weight\="bold"><font\-data ttf\="tahomabd.ttf"\/><\/font>
 | 
						|
      <\/font\-family>
 | 
						|
 | 
						|
      <font\-family name\="Verdana" embed\="true">
 | 
						|
        <font><font\-data ttf\="verdana.ttf"\/><\/font>
 | 
						|
        <font style\="oblique"><font\-data ttf\="verdanai.ttf"\/><\/font>
 | 
						|
        <font weight\="bold"><font\-data ttf\="verdanab.ttf"\/><\/font>
 | 
						|
        <font weight\="bold" style\="oblique"><font\-data ttf\="verdanaz.ttf"\/><\/font>
 | 
						|
      <\/font\-family>
 | 
						|
 | 
						|
      <font\-family name\="Palatino" embed\="true" ligatures\="ff fi fl ffi ffl">
 | 
						|
        <font><font\-data ttf\="pala.ttf"\/><\/font>
 | 
						|
        <font style\="italic"><font\-data ttf\="palai.ttf"\/><\/font>
 | 
						|
        <font weight\="bold"><font\-data ttf\="palab.ttf"\/><\/font>
 | 
						|
        <font weight\="bold" style\="italic"><font\-data ttf\="palabi.ttf"\/><\/font>
 | 
						|
      <\/font\-family>
 | 
						|
 | 
						|
    <font-family name="Lucida Sans Unicode">
 | 
						|
         <!-- <font><font-data ttf="lsansuni.ttf"></font> -->
 | 
						|
         <!-- actually called l_10646.ttf on Windows 2000 and Vista Sp1 -->
 | 
						|
         <font><font-data ttf="l_10646.ttf"/></font>
 | 
						|
    </font-family>
 | 
						|
]
 | 
						|
 | 
						|
PAB had to alter his because the Lucida Sans Unicode font had a different name.
 | 
						|
Other changes are very likely to be required if you are not using Windows.
 | 
						|
 | 
						|
XZ authored his equations using the venerable Latex, JM converted these to
 | 
						|
MathML using [@http://gentoo-wiki.com/HOWTO_Convert_LaTeX_to_HTML_with_MathML mxlatex].
 | 
						|
This process is currently unreliable and required some manual intervention:
 | 
						|
consequently Latex source is not considered a viable route for the automatic
 | 
						|
production of SVG versions of equations.
 | 
						|
 | 
						|
Equations are embedded in the quickbook source using the /equation/
 | 
						|
template defined in math.qbk.  This outputs Docbook XML that looks like:
 | 
						|
 | 
						|
[pre
 | 
						|
<inlinemediaobject>
 | 
						|
<imageobject role="html">
 | 
						|
<imagedata fileref="../equations/myfile.png"></imagedata>
 | 
						|
</imageobject>
 | 
						|
<imageobject role="print">
 | 
						|
<imagedata fileref="../equations/myfile.svg"></imagedata>
 | 
						|
</imageobject>
 | 
						|
</inlinemediaobject>
 | 
						|
]
 | 
						|
 | 
						|
MathML is not currently present in the Docbook output, or in the
 | 
						|
generated HTML: this needs further investigation.
 | 
						|
 | 
						|
[h4 Producing Graphs]
 | 
						|
 | 
						|
Graphs were produced in SVG format and then converted to PNG's using the same
 | 
						|
process as the equations.
 | 
						|
 | 
						|
The programs
 | 
						|
`/libs/math/doc/sf_and_dist/graphs/dist_graphs.cpp`
 | 
						|
and `/libs/math/doc/sf_and_dist/graphs/sf_graphs.cpp`
 | 
						|
generate the SVG's directly using the
 | 
						|
[@http://code.google.com/soc/2007/boost/about.html Google Summer of Code 2007]
 | 
						|
project of Jacob Voytko (whose work so far,
 | 
						|
considerably enhanced and now reasonably mature and usable, by Paul A. Bristow,
 | 
						|
is at .\boost-sandbox\SOC\2007\visualization).
 | 
						|
 | 
						|
[endsect] [/section:sf_implementation Implementation Notes]
 | 
						|
 | 
						|
[/
 | 
						|
  Copyright 2006, 2007, 2010 John Maddock and Paul A. Bristow.
 | 
						|
  Distributed under the Boost Software License, Version 1.0.
 | 
						|
  (See accompanying file LICENSE_1_0.txt or copy at
 | 
						|
  http://www.boost.org/LICENSE_1_0.txt).
 | 
						|
]
 | 
						|
 | 
						|
 |