added libtommath-0.14
parent
b66471f74f
commit
82f4858291
199
bn.tex
@ -1,15 +1,15 @@
|
||||
\documentclass{article}
|
||||
\begin{document}
|
||||
|
||||
\title{LibTomMath v0.13 \\ A Free Multiple Precision Integer Library}
|
||||
\title{LibTomMath v0.14 \\ A Free Multiple Precision Integer Library \\ http://math.libtomcrypt.org }
|
||||
\author{Tom St Denis \\ tomstdenis@iahu.ca}
|
||||
\maketitle
|
||||
\newpage
|
||||
|
||||
\section{Introduction}
|
||||
``LibTomMath'' is a free and open source library that provides multiple-precision integer functions required to form a basis
|
||||
of a public key cryptosystem. LibTomMath is written entire in portable ISO C source code and designed to have an application
|
||||
interface much like that of MPI from Michael Fromberger.
|
||||
``LibTomMath'' is a free and open source library that provides multiple-precision integer functions required to form a
|
||||
basis of a public key cryptosystem. LibTomMath is written entirely in portable ISO C source code and designed to have an
|
||||
application interface much like that of MPI from Michael Fromberger.
|
||||
|
||||
LibTomMath was written from scratch by Tom St Denis but designed to be a drop-in replacement for the MPI package. The
|
||||
algorithms within the library are derived from descriptions as provided in the Handbook of Applied Cryptography and Knuth's
|
||||
@ -23,8 +23,7 @@ LibTomMath was designed with the following goals in mind:
|
||||
\item Be written entirely in portable C.
|
||||
\end{enumerate}
|
||||
|
||||
All three goals have been achieved. Particularly the speed increase goal. For example, a 512-bit modular exponentiation
|
||||
is eight times faster\footnote{On an Athlon XP with GCC 3.2} with LibTomMath compared to MPI.
|
||||
All three goals have been achieved to one extent or another (actual figures depend on what platform you are using).
|
||||
|
||||
Being compatible with MPI means that applications that already use it can be ported fairly quickly. Currently there are
|
||||
a few differences but there are many similarities. In fact the average MPI based application can be ported in under 15
|
||||
@ -54,16 +53,26 @@ make install
|
||||
|
||||
Now within your application include ``tommath.h'' and link against libtommath.a to get MPI-like functionality.
|
||||
|
||||
\subsection{Microsoft Visual C++}
|
||||
A makefile is also provided for MSVC (\textit{tested against MSVC 6.00 with SP5}) which allows the library to be used
|
||||
with that compiler as well. To build the library type
|
||||
|
||||
\begin{verbatim}
|
||||
nmake -f makefile.msvc
|
||||
\end{verbatim}
|
||||
|
||||
This builds ``tommath.lib''.
|
||||
|
||||
\section{Programming with LibTomMath}
|
||||
|
||||
\subsection{The mp\_int Structure}
|
||||
All multiple precision integers are stored in a structure called \textbf{mp\_int}. A multiple precision integer is
|
||||
essentially an array of \textbf{mp\_digit}. mp\_digit is defined at the top of bn.h. Its type can be changed to suit
|
||||
a particular platform.
|
||||
essentially an array of \textbf{mp\_digit}. mp\_digit is defined at the top of ``tommath.h''. The type can be changed
|
||||
to suit a particular platform.
|
||||
|
||||
For example, when \textbf{MP\_8BIT} is defined\footnote{When building bn.c.} a mp\_digit is a unsigned char and holds
|
||||
seven bits. Similarly when \textbf{MP\_16BIT} is defined a mp\_digit is a unsigned short and holds 15 bits.
|
||||
By default a mp\_digit is a unsigned long and holds 28 bits.
|
||||
For example, when \textbf{MP\_8BIT} is defined an mp\_digit is an unsigned char and holds seven bits. Similarly
when \textbf{MP\_16BIT} is defined an mp\_digit is an unsigned short and holds 15 bits. By default an mp\_digit is an
unsigned long and holds 28 bits, which is optimal for most 32-bit and 64-bit processors.
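
As an illustration of the digit layout described above, the following is a minimal sketch (not code from the library) that splits a 64-bit value into 28-bit digits and reassembles it. The names \texttt{to\_digits} and \texttt{from\_digits} are hypothetical, and \texttt{uint32\_t} stands in for the library's mp\_digit type purely for the sake of a self-contained example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIGIT_BIT  28                                   /* bits used per digit */
#define DIGIT_MASK ((((uint32_t)1) << DIGIT_BIT) - 1)

/* Hypothetical helper: split v into little-endian 28-bit digits.
 * Returns the number of digits used (zero for v == 0). */
size_t to_digits(uint64_t v, uint32_t *dp, size_t cap)
{
    size_t used = 0;
    while (v != 0) {
        assert(used < cap);                 /* caller must supply enough room */
        dp[used++] = (uint32_t)(v & DIGIT_MASK);
        v >>= DIGIT_BIT;
    }
    return used;
}

/* Hypothetical helper: rebuild the value from its digits (most significant first). */
uint64_t from_digits(const uint32_t *dp, size_t used)
{
    uint64_t v = 0;
    while (used > 0) {
        v = (v << DIGIT_BIT) | dp[--used];
    }
    return v;
}
```

The top four bits of each 32-bit word stay unused in this layout; that headroom is what lets carries accumulate cheaply in the multipliers.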
|
||||
|
||||
The choice of digit is particular to the platform at hand and what available multipliers are provided. For
|
||||
MP\_8BIT either a $8 \times 8 \Rightarrow 16$ or $16 \times 16 \Rightarrow 16$ multiplier is optimal. When
|
||||
@ -83,20 +92,19 @@ $W$ is the number of bits in a digit (default is 28).
|
||||
|
||||
\subsection{Calling Functions}
|
||||
Most functions expect pointers to mp\_int's as parameters. To save on memory usage it is possible to have source
|
||||
variables as destinations. For example:
|
||||
variables as destinations. The arguments are read left to right, so to compute $x + y = z$ you would pass the arguments
|
||||
in the order $x, y, z$. For example:
|
||||
\begin{verbatim}
|
||||
mp_add(&x, &y, &x); /* x = x + y */
|
||||
mp_mul(&x, &z, &x); /* x = x * z */
|
||||
mp_div_2(&x, &x); /* x = x / 2 */
|
||||
mp_mul(&y, &x, &z); /* z = y * x */
|
||||
mp_div_2(&x, &y); /* y = x / 2 */
|
||||
\end{verbatim}
|
||||
|
||||
\section{Quick Overview}
|
||||
\subsection{Return Values}
|
||||
All functions that return errors will return \textbf{MP\_OKAY} if the function was succesful. It will return
|
||||
\textbf{MP\_MEM} if it ran out of heap memory or \textbf{MP\_VAL} if one of the arguements is out of range.
|
||||
|
||||
\subsection{Basic Functionality}
|
||||
Essentially all LibTomMath functions return one of three values to indicate if the function worked as desired. A
|
||||
function will return \textbf{MP\_OKAY} if the function was successful. A function will return \textbf{MP\_MEM} if
|
||||
it ran out of memory and \textbf{MP\_VAL} if the input was invalid.
|
||||
|
||||
Before an mp\_int can be used it must be initialized with
|
||||
|
||||
\begin{verbatim}
|
||||
@ -106,7 +114,7 @@ int mp_init(mp_int *a);
|
||||
For example, consider the following.
|
||||
|
||||
\begin{verbatim}
|
||||
#include "bn.h"
|
||||
#include "tommath.h"
|
||||
int main(void)
|
||||
{
|
||||
mp_int num;
|
||||
@ -383,6 +391,18 @@ in $c$ and returns success.
|
||||
|
||||
This function requires $O(N)$ additional digits of memory and $O(2 \cdot N)$ time.
|
||||
|
||||
\subsubsection{mp\_mul\_2(mp\_int *a, mp\_int *b)}
|
||||
Multiplies $a$ by two and stores in $b$. This function is hard-coded to do a shift by one place so it is faster
|
||||
than calling mp\_mul\_2d with a count of one.
|
||||
|
||||
This function requires $O(N)$ additional digits of memory and $O(N)$ time.
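
The single-bit shift can be sketched as follows. This is an illustrative stand-alone routine on a plain array of 28-bit digits, not the library's mp\_mul\_2; the name \texttt{mul\_2}, the \texttt{uint32\_t} digit type and the explicit digit count are assumptions made for the example, and the sign is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIGIT_BIT  28
#define DIGIT_MASK ((((uint32_t)1) << DIGIT_BIT) - 1)

/* b = 2 * a on little-endian digit arrays. b needs room for used + 1
 * digits and may be the same array as a. Returns the digits used in b. */
size_t mul_2(const uint32_t *a, size_t used, uint32_t *b)
{
    uint32_t carry = 0;
    size_t i;

    for (i = 0; i < used; i++) {
        uint32_t next;
        assert(a[i] <= DIGIT_MASK);                 /* digits hold 28 bits */
        next = a[i] >> (DIGIT_BIT - 1);             /* bit shifted out of this digit */
        b[i] = ((a[i] << 1) | carry) & DIGIT_MASK;  /* shift in the previous carry */
        carry = next;
    }
    if (carry != 0) {
        b[used++] = carry;                          /* result grew by one digit */
    }
    return used;
}
```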
|
||||
|
||||
\subsubsection{mp\_div\_2(mp\_int *a, mp\_int *b)}
|
||||
Divides $a$ by two and stores in $b$. This function is hard-coded to do a shift by one place so it is faster
|
||||
than calling mp\_div\_2d with a count of one.
|
||||
|
||||
This function requires $O(N)$ additional digits of memory and $O(N)$ time.
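
A matching sketch for the halving operation is given below, again on a plain digit array rather than an mp\_int. The name \texttt{div\_2} and the \texttt{uint32\_t} digit type are assumptions for the example; the sign handling and the clamping that the library performs on the mp\_int are reduced here to trimming leading zero digits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIGIT_BIT  28
#define DIGIT_MASK ((((uint32_t)1) << DIGIT_BIT) - 1)

/* b = a / 2 on little-endian digit arrays (b may be the same array as a).
 * Returns the digits used in b after trimming leading zero digits. */
size_t div_2(const uint32_t *a, size_t used, uint32_t *b)
{
    uint32_t carry = 0;
    size_t i = used;

    while (i-- > 0) {                               /* most significant digit first */
        uint32_t next;
        assert(a[i] <= DIGIT_MASK);
        next = a[i] & 1;                            /* bit shifted out of this digit */
        b[i] = (a[i] >> 1) | (carry << (DIGIT_BIT - 1));
        carry = next;
    }
    while (used > 0 && b[used - 1] == 0) {
        used--;                                     /* drop leading zero digits */
    }
    return used;
}
```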
|
||||
|
||||
\subsubsection{mp\_mod\_2d(mp\_int *a, int b, mp\_int *c)}
|
||||
Performs the action of reducing $a$ modulo $2^b$ and stores the result in $c$. If the shift count $b$ is less than
|
||||
or equal to zero the function places $a$ in $c$ and returns success.
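
For a positive bit count the reduction modulo $2^b$ amounts to masking, as the following sketch shows. It is an illustration on a plain digit array, not the library routine; the name \texttt{mod\_2d} and the \texttt{uint32\_t} digit type are assumptions, and the case $b \le 0$ is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIGIT_BIT 28

/* Reduce a little-endian digit array in place modulo 2^b, for b > 0.
 * Returns the number of digits used afterwards. */
size_t mod_2d(uint32_t *dp, size_t used, int b)
{
    size_t full, i;

    assert(b > 0);                          /* this sketch handles positive b only */
    full = (size_t)b / DIGIT_BIT;           /* digits that are kept whole */
    if (full >= used) {
        return used;                        /* value is already below 2^b */
    }
    i = full;
    if (b % DIGIT_BIT != 0) {
        /* keep only the low b mod DIGIT_BIT bits of the boundary digit */
        dp[full] &= (((uint32_t)1) << (b % DIGIT_BIT)) - 1;
        i = full + 1;
    }
    for (; i < used; i++) {
        dp[i] = 0;                          /* clear everything above 2^b */
    }
    while (used > 0 && dp[used - 1] == 0) {
        used--;
    }
    return used;
}
```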
|
||||
@ -412,7 +432,7 @@ of $c$ is the maximum length of the two inputs.
|
||||
\subsection{Basic Arithmetic}
|
||||
|
||||
\subsubsection{mp\_cmp(mp\_int *a, mp\_int *b)}
|
||||
Performs a \textbf{signed} comparison between $a$ and $b$ returning \textbf{MP\_GT} is $a$ is larger than $b$.
|
||||
Performs a \textbf{signed} comparison between $a$ and $b$ returning \textbf{MP\_GT} if $a$ is larger than $b$.
|
||||
|
||||
This function requires no additional memory and $O(N)$ time.
|
||||
|
||||
@ -559,57 +579,6 @@ A very useful observation is that multiplying by $R = \beta^n$ amounts to perfor
|
||||
requires no single precision multiplications.
|
||||
|
||||
\section{Timing Analysis}
|
||||
\subsection{Observed Timings}
|
||||
A simple test program ``demo.c'' was developed which builds with either MPI or LibTomMath (without modification). The
|
||||
test was conducted on an AMD Athlon XP processor with 266Mhz DDR memory and the GCC 3.2 compiler\footnote{With build
|
||||
options ``-O3 -fomit-frame-pointer -funroll-loops''}. The multiplications and squarings were repeated 100,000 times
|
||||
each while the modular exponentiation (exptmod) were performed 50 times each. The ``inversions'' refers to multiplicative
|
||||
inversions modulo an odd number of a given size. The RDTSC (Read Time Stamp Counter) instruction was used to measure the
|
||||
time the entire iterations took and was divided by the number of iterations to get an average. The following results
|
||||
were observed.
|
||||
|
||||
\begin{small}
|
||||
\begin{center}
|
||||
\begin{tabular}{c|c|c|c}
|
||||
\hline \textbf{Operation} & \textbf{Size (bits)} & \textbf{Time with MPI (cycles)} & \textbf{Time with LibTomMath (cycles)} \\
|
||||
\hline
|
||||
Inversion & 128 & 264,083 & 59,782 \\
|
||||
Inversion & 256 & 549,370 & 146,915 \\
|
||||
Inversion & 512 & 1,675,975 & 367,172 \\
|
||||
Inversion & 1024 & 5,237,957 & 1,054,158 \\
|
||||
Inversion & 2048 & 17,871,944 & 3,459,683 \\
|
||||
Inversion & 4096 & 66,610,468 & 11,834,556 \\
|
||||
\hline
|
||||
Multiply & 128 & 1,426 & 451 \\
|
||||
Multiply & 256 & 2,551 & 958 \\
|
||||
Multiply & 512 & 7,913 & 2,476 \\
|
||||
Multiply & 1024 & 28,496 & 7,927 \\
|
||||
Multiply & 2048 & 109,897 & 28,224 \\
|
||||
Multiply & 4096 & 469,970 & 101,171 \\
|
||||
\hline
|
||||
Square & 128 & 1,319 & 511 \\
|
||||
Square & 256 & 1,776 & 947 \\
|
||||
Square & 512 & 5,399 & 2,153 \\
|
||||
Square & 1024 & 18,991 & 5,733 \\
|
||||
Square & 2048 & 72,126 & 17,621 \\
|
||||
Square & 4096 & 306,269 & 67,576 \\
|
||||
\hline
|
||||
Exptmod & 512 & 32,021,586 & 3,118,435 \\
|
||||
Exptmod & 768 & 97,595,492 & 8,493,633 \\
|
||||
Exptmod & 1024 & 223,302,532 & 17,715,899 \\
|
||||
Exptmod & 2048 & 1,682,223,369 & 114,936,361 \\
|
||||
Exptmod & 2560 & 3,268,615,571 & 229,402,426 \\
|
||||
Exptmod & 3072 & 5,597,240,141 & 367,403,840 \\
|
||||
Exptmod & 4096 & 13,347,270,891 & 779,058,433
|
||||
|
||||
\end{tabular}
|
||||
\end{center}
|
||||
\end{small}
|
||||
|
||||
Note that the figures do fluctuate but their magnitudes are relatively intact. The purpose of the chart is not to
|
||||
get an exact timing but to compare the two libraries. For example, in all of the tests the exact time for a 512-bit
|
||||
squaring operation was not the same. The observed times were all approximately 2,500 cycles, more importantly they
|
||||
were always faster than the timings observed with MPI by about the same magnitude.
|
||||
|
||||
\subsection{Digit Size}
|
||||
The first major contribution to the time savings is the fact that 28 bits are stored per digit instead of the MPI
|
||||
@ -619,29 +588,59 @@ A savings of $64^2 - 37^2 = 2727$ single precision multiplications.
|
||||
|
||||
\subsection{Multiplication Algorithms}
|
||||
For most inputs a typical baseline $O(n^2)$ multiplier is used which is similar to that of MPI. There are two variants
|
||||
of the baseline multiplier. The normal and the fast variants. The normal baseline multiplier is the exact same as the
|
||||
algorithm from MPI. The fast baseline multiplier is optimized for cases where the number of input digits $N$ is less
|
||||
than or equal to $2^{w}/\beta^2$. Where $w$ is the number of bits in a \textbf{mp\_word}. By default a mp\_word is
|
||||
64-bits which means $N \le 256$ is allowed which represents numbers upto $7168$ bits.
|
||||
of the baseline multiplier: the normal variant and the fast comba variant. The normal baseline multiplier is exactly the same as
|
||||
the algorithm from MPI. The fast comba baseline multiplier is optimized for cases where the number of input digits $N$
|
||||
is less than or equal to $2^{w}/\beta^2$, where $w$ is the number of bits in an \textbf{mp\_word} and $\beta$ is the radix of a digit ($2^{28}$ by default).
|
||||
By default an mp\_word is 64 bits, which means $N \le 256$ is allowed; this represents numbers of up to $7,168$ bits. However,
|
||||
since the Karatsuba multiplier (discussed below) will kick in before that size the slower baseline algorithm (that MPI
|
||||
uses) should never really be used in a default configuration.
|
||||
|
||||
The fast baseline multiplier is optimized by removing the carry operations from the inner loop. This is often referred
|
||||
to as the ``comba'' method since it computes the products a columns first then figures out the carries. This has the
|
||||
effect of making a very simple and paralizable inner loop.
|
||||
The fast comba baseline multiplier is optimized by removing the carry operations from the inner loop. This is often
|
||||
referred to as the ``comba'' method since it computes the products column by column first and then figures out the carries. To
|
||||
accommodate this the result of the inner multiplications must be stored in words large enough not to lose the carry bits.
|
||||
This is why there is a limit of $2^{w}/\beta^2$ digits in the input. This optimization has the effect of making a
|
||||
very simple and efficient inner loop.
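
The idea of deferring the carries can be sketched as follows. This is a simplified illustration written for this text, not the library's fast multiplier: the name \texttt{comba\_mul}, the \texttt{uint32\_t} and \texttt{uint64\_t} stand-ins for mp\_digit and mp\_word, and the fixed-size work array are all assumptions. The inner loop only accumulates products; a single pass afterwards propagates the carries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIGIT_BIT  28
#define DIGIT_MASK ((((uint64_t)1) << DIGIT_BIT) - 1)
#define MAX_DIGITS 256          /* 2^64 / (2^28)^2, the limit quoted in the text */

/* c = a * b on little-endian 28-bit digit arrays; c must hold na + nb digits
 * and must not overlap a or b. */
void comba_mul(const uint32_t *a, size_t na,
               const uint32_t *b, size_t nb, uint32_t *c)
{
    uint64_t W[2 * MAX_DIGITS];             /* one wide accumulator per column */
    size_t i, j, n = na + nb;

    assert(na >= 1 && nb >= 1 && na <= MAX_DIGITS && nb <= MAX_DIGITS);
    for (i = 0; i < n; i++) {
        W[i] = 0;
    }
    /* accumulate products by column; no carry handling in the inner loop */
    for (i = 0; i < na; i++) {
        for (j = 0; j < nb; j++) {
            W[i + j] += (uint64_t)a[i] * b[j];
        }
    }
    /* now figure out the carries in one pass */
    for (i = 0; i + 1 < n; i++) {
        W[i + 1] += W[i] >> DIGIT_BIT;
        c[i] = (uint32_t)(W[i] & DIGIT_MASK);
    }
    c[n - 1] = (uint32_t)(W[n - 1] & DIGIT_MASK);
}
```

With at most 256 products of 56 bits each per column, plus an incoming carry below $2^{36}$, every accumulator stays under $2^{64}$, which is why the digit limit exists.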
|
||||
|
||||
For large inputs, typically 80 digits\footnote{By default that is 2240-bits or more.} or more the Karatsuba method is
|
||||
used. This method has significant overhead but an asymptotic running time of $O(n^{1.584})$ which means for fairly large
|
||||
inputs this method is faster. The Karatsuba implementation is recursive which means for extremely large inputs they
|
||||
will benefit from the algorithm.
|
||||
\subsubsection{Karatsuba Multiplier}
|
||||
For large inputs, typically 80 digits\footnote{By default that is 2240-bits or more.} or more the Karatsuba multiplication
|
||||
method is used. This method has significant overhead but an asymptotic running time of $O(n^{1.584})$ which means for
|
||||
fairly large inputs this method is faster than the baseline (or comba) algorithm. The Karatsuba implementation is
|
||||
recursive, which means that extremely large inputs also benefit from the algorithm.
|
||||
|
||||
The algorithm is based on the observation that if
|
||||
|
||||
\begin{eqnarray}
|
||||
x = x_0 + x_1\beta \nonumber \\
|
||||
y = y_0 + y_1\beta
|
||||
\end{eqnarray}
|
||||
|
||||
where $x_0, x_1$ and $y_0, y_1$ are half the size of $x$ and $y$ respectively, then
|
||||
|
||||
\begin{equation}
|
||||
x \cdot y = x_1y_1\beta^2 + (x_0y_0 + x_1y_1 - (x_1 - x_0)(y_1 - y_0))\beta + x_0y_0
\end{equation}

From this it follows that only three products have to be computed: $x_0y_0$, $x_1y_1$ and $(x_1-x_0)(y_1-y_0)$, which
|
||||
are all of half size numbers. A multiplication of two half size numbers requires only $1 \over 4$ of the
|
||||
original work which means with no recursion the Karatsuba algorithm achieves a running time of ${3n^2}\over 4$.
|
||||
The routine provided does recursion which is where the $O(n^{1.584})$ work factor comes from.
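
A single level of the method can be checked on machine words. The sketch below is an illustration only (the function name \texttt{karatsuba32} is hypothetical): it multiplies two 32-bit numbers with $\beta = 2^{16}$ using three half-size products, relying on the identity $x_1y_0 + x_0y_1 = x_0y_0 + x_1y_1 - (x_1 - x_0)(y_1 - y_0)$ for the middle term.

```c
#include <assert.h>
#include <stdint.h>

/* One level of Karatsuba on 32-bit operands split into 16-bit halves. */
uint64_t karatsuba32(uint32_t x, uint32_t y)
{
    int64_t x0 = x & 0xFFFF, x1 = x >> 16;      /* x = x0 + x1 * 2^16 */
    int64_t y0 = y & 0xFFFF, y1 = y >> 16;      /* y = y0 + y1 * 2^16 */

    int64_t x0y0 = x0 * y0;                     /* first product  */
    int64_t x1y1 = x1 * y1;                     /* second product */
    int64_t t    = (x1 - x0) * (y1 - y0);       /* third product, may be negative */
    uint64_t mid;

    assert(x0y0 + x1y1 >= t);                   /* middle term x1*y0 + x0*y1 >= 0 */
    mid = (uint64_t)(x0y0 + x1y1 - t);

    /* multiplying by beta and beta^2 is just a shift */
    return ((uint64_t)x1y1 << 32) + (mid << 16) + (uint64_t)x0y0;
}
```

A recursive implementation would apply the same split to each of the three half-size products until the inputs fall below the cutoff.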
|
||||
|
||||
The multiplications by $\beta$ and $\beta^2$ amount to digit shift operations.
|
||||
The extra overhead in the Karatsuba method comes from extracting the half size numbers $x_0, x_1, y_0, y_1$ and
|
||||
performing the various smaller calculations.
|
||||
|
||||
The library has been fairly optimized to extract the digits using hard-coded routines instead of the higher
level functions; however, there is still significant overhead to optimize away.
|
||||
|
||||
MPI only implements the slower baseline multiplier where carries are dealt with in the inner loop. As a result even at
|
||||
smaller numbers (below the Karatsuba cutoff) the LibTomMath multipliers are faster.
|
||||
|
||||
\subsection{Squaring Algorithms}
|
||||
|
||||
Similar to the multiplication algorithms there are two baseline squaring algorithms. Both have an asymptotic running
|
||||
time of $O((t^2 + t)/2)$. The normal baseline squaring is the same from MPI and the fast is a ``comba'' squaring
|
||||
algorithm. The comba method is used if the number of digits $N$ is less than $2^{w-1}/\beta^2$ which by default
|
||||
covers numbers upto $3584$ bits.
|
||||
Similar to the multiplication algorithms there are two baseline squaring algorithms. Both have an asymptotic
|
||||
running time of $O((t^2 + t)/2)$. The normal baseline squaring is the same as in MPI and the fast method is
|
||||
a ``comba'' squaring algorithm. The comba method is used if the number of digits $N$ is less than
|
||||
$2^{w-1}/\beta^2$ which by default covers numbers up to $3,584$ bits.
|
||||
|
||||
There is also a Karatsuba squaring method which achieves a running time of $O(n^{1.584})$ after considerably large
|
||||
inputs.
|
||||
@ -653,25 +652,31 @@ than MPI is.
|
||||
|
||||
LibTomMath implements a sliding window $k$-ary left to right exponentiation algorithm. For a given exponent size $L$ an
|
||||
appropriate window size $k$ is chosen. There are always at most $L$ modular squarings and $\lfloor L/k \rfloor$ modular
|
||||
multiplications. The $k$-ary method works by precomputing values $g(x) = b^x$ for $0 \le x < 2^k$ and a given base
|
||||
multiplications. The $k$-ary method works by precomputing values $g(x) = b^x$ for $2^{k-1} \le x < 2^k$ and a given base
|
||||
$b$. Then the multiplications are grouped in windows of $k$ bits. The sliding window technique has the benefit
|
||||
that it can skip multiplications if there are zero bits following or preceding a window. Consider the exponent
|
||||
$e = 11110001_2$: if $k = 2$ then there will be two squarings, a multiplication of $g(3)$, two squarings, a multiplication
|
||||
of $g(3)$, four squarings and and a multiplication by $g(1)$. In total there are 8 squarings and 3 multiplications.
|
||||
of $g(3)$, four squarings and a multiplication by $g(1)$. In total there are 8 squarings and 3 multiplications.
|
||||
|
||||
MPI uses a binary square-multiply method. For the same exponent $e$ it would have had 8 squarings and 5 multiplications.
|
||||
There is a precomputation phase for the method LibTomMath uses but it generally cuts down considerably on the number
|
||||
of multiplications. Consider a 512-bit exponent. The worst case for the LibTomMath method results in 512 squarings and
|
||||
124 multiplications. The MPI method would have 512 squarings and 512 multiplications. Randomly every $2k$ bits another
|
||||
multiplication is saved via the sliding-window technique on top of the savings the $k$-ary method provides.
|
||||
MPI uses a binary square-multiply method for exponentiation. For the same exponent $e = 11110001_2$ it would have had to
|
||||
perform 8 squarings and 5 multiplications. There is a precomputation phase for the method LibTomMath uses but it
|
||||
generally cuts down considerably on the number of multiplications. Consider a 512-bit exponent. The worst case for the
|
||||
LibTomMath method results in 512 squarings and 124 multiplications. The MPI method would have 512 squarings
|
||||
and 512 multiplications. Randomly every $2k$ bits another multiplication is saved via the sliding-window
|
||||
technique on top of the savings the $k$-ary method provides.
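
The operation counts quoted above for $e = 11110001_2$ can be reproduced with a small experiment on machine words. The following is a sketch under stated assumptions, not the library's exptmod: the names \texttt{binary\_exptmod}, \texttt{sliding\_exptmod} and \texttt{op\_count} are hypothetical, the modulus is limited to 32 bits so plain 64-bit arithmetic suffices, and the table precomputation is not counted.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned squarings;
    unsigned multiplications;
} op_count;

/* (a * b) mod m; cannot overflow because callers assert m < 2^32. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m)
{
    return (a * b) % m;
}

/* Left-to-right binary square-and-multiply. */
uint64_t binary_exptmod(uint64_t b, uint64_t e, uint64_t m, op_count *cnt)
{
    uint64_t res;
    int i, started = 0;

    assert(m >= 1 && m < ((uint64_t)1 << 32));
    cnt->squarings = cnt->multiplications = 0;
    res = 1 % m;
    b %= m;
    for (i = 63; i >= 0; i--) {
        int bit = (int)((e >> i) & 1);
        if (!started && !bit) continue;             /* skip leading zero bits */
        started = 1;
        res = mulmod(res, res, m); cnt->squarings++;
        if (bit) { res = mulmod(res, b, m); cnt->multiplications++; }
    }
    return res;
}

/* Sliding-window k-ary exponentiation with window size k (1 to 6). */
uint64_t sliding_exptmod(uint64_t b, uint64_t e, uint64_t m, int k, op_count *cnt)
{
    uint64_t g[64], res;
    int i, x, top, mode = 0, bitcpy = 0, bitbuf = 0;

    assert(m >= 1 && m < ((uint64_t)1 << 32));
    assert(k >= 1 && k <= 6);
    cnt->squarings = cnt->multiplications = 0;
    res = 1 % m;
    b %= m;

    /* precompute g[1] = b and g[x] = b^x for 2^(k-1) <= x < 2^k */
    top = 1 << (k - 1);
    g[1] = b;
    g[top] = b;
    for (x = 0; x < k - 1; x++) g[top] = mulmod(g[top], g[top], m);
    for (x = top + 1; x < (1 << k); x++) g[x] = mulmod(g[x - 1], b, m);

    for (i = 63; i >= 0; i--) {
        int bit = (int)((e >> i) & 1);
        if (mode == 0 && !bit) continue;            /* leading zeros */
        if (mode == 1 && !bit) {                    /* zero bit between windows */
            res = mulmod(res, res, m); cnt->squarings++;
            continue;
        }
        bitbuf |= bit << (k - ++bitcpy);            /* append bit to the window */
        mode = 2;
        if (bitcpy == k) {                          /* window is full */
            for (x = 0; x < k; x++) { res = mulmod(res, res, m); cnt->squarings++; }
            res = mulmod(res, g[bitbuf], m); cnt->multiplications++;
            bitcpy = 0; bitbuf = 0; mode = 1;
        }
    }
    if (mode == 2) {                                /* leftover partial window */
        for (x = 0; x < bitcpy; x++) {
            res = mulmod(res, res, m); cnt->squarings++;
            bitbuf <<= 1;
            if (bitbuf & (1 << k)) { res = mulmod(res, b, m); cnt->multiplications++; }
        }
    }
    return res;
}
```

Calling both routines with $e = 11110001_2$ and $k = 2$ gives the same result, with 8 squarings and 3 multiplications for the sliding window against 8 squarings and 5 multiplications for the binary method.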
|
||||
|
||||
Both LibTomMath and MPI use Barrett reduction instead of division to reduce the numbers modulo the modulus given.
|
||||
However, LibTomMath can take advantage of the fact that the multiplications required within the Barrett reduction
|
||||
do not have to give full precision. As a result the reduction step is much faster and just as accurate. The LibTomMath code
|
||||
will automatically determine at run-time (e.g. when its called) whether the faster multiplier can be used. The
|
||||
do not have to give full precision. As a result the reduction step is much faster and just as accurate. The LibTomMath
|
||||
code will automatically determine at run-time (i.e. when it is called) whether the faster multiplier can be used. The
|
||||
faster multipliers have also been optimized into the two variants (baseline and comba baseline).
|
||||
|
||||
LibTomMath also has a variant of the exptmod function that uses Montgomery reductions instead of Barrett reductions
|
||||
which is faser. As a result of all these changes exponentiation in LibTomMath is much faster than compared to MPI.
|
||||
which is faster. The code will automatically detect when the Montgomery version can be used (\textit{Requires the
|
||||
modulus to be odd and below the MONTGOMERY\_EXPT\_CUTOFF size}). The Montgomery routine is essentially a copy of the
|
||||
Barrett exponentiation routine except it uses Montgomery reduction.
|
||||
|
||||
As a result of all these changes exponentiation in LibTomMath is much faster than in MPI. On most ALU-strong
|
||||
processors (AMD Athlon for instance) exponentiation in LibTomMath is often more than ten times faster than MPI.
|
||||
|
||||
\end{document}
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -100,14 +100,18 @@ fast_mp_montgomery_reduce (mp_int * a, mp_int * m, mp_digit mp)
|
||||
W[ix + 1] += W[ix] >> ((mp_word) DIGIT_BIT);
|
||||
}
|
||||
|
||||
/* nox fix rest of carries */
|
||||
for (++ix; ix <= m->used * 2 + 1; ix++) {
|
||||
W[ix] += (W[ix - 1] >> ((mp_word) DIGIT_BIT));
|
||||
}
|
||||
|
||||
{
|
||||
register mp_digit *tmpa;
|
||||
register mp_word *_W;
|
||||
register mp_word *_W, *_W1;
|
||||
|
||||
/* now fix rest of carries */
|
||||
_W1 = W + ix;
|
||||
_W = W + ++ix;
|
||||
|
||||
for (; ix <= m->used * 2 + 1; ix++) {
|
||||
*_W++ += *_W1++ >> ((mp_word) DIGIT_BIT);
|
||||
}
|
||||
|
||||
/* copy out, A = A/b^n
|
||||
*
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -46,6 +46,7 @@ mp_div_2 (mp_int * a, mp_int * b)
|
||||
*tmpb++ = 0;
|
||||
}
|
||||
}
|
||||
b->sign = a->sign;
|
||||
mp_clamp (b);
|
||||
return MP_OKAY;
|
||||
}
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -51,7 +51,9 @@ mp_div_2d (mp_int * a, int b, mp_int * c, mp_int * d)
|
||||
}
|
||||
|
||||
/* shift by as many digits in the bit count */
|
||||
mp_rshd (c, b / DIGIT_BIT);
|
||||
if (b >= DIGIT_BIT) {
|
||||
mp_rshd (c, b / DIGIT_BIT);
|
||||
}
|
||||
|
||||
/* shift any bit count < DIGIT_BIT */
|
||||
D = (mp_digit) (b % DIGIT_BIT);
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -37,8 +37,7 @@ int
|
||||
mp_karatsuba_mul (mp_int * a, mp_int * b, mp_int * c)
|
||||
{
|
||||
mp_int x0, x1, y0, y1, t1, t2, x0y0, x1y1;
|
||||
int B, err, x;
|
||||
|
||||
int B, err;
|
||||
|
||||
err = MP_MEM;
|
||||
|
||||
@ -59,13 +58,13 @@ mp_karatsuba_mul (mp_int * a, mp_int * b, mp_int * c)
|
||||
goto Y0;
|
||||
|
||||
/* init temps */
|
||||
if (mp_init (&t1) != MP_OKAY)
|
||||
if (mp_init_size (&t1, B * 2) != MP_OKAY)
|
||||
goto Y1;
|
||||
if (mp_init (&t2) != MP_OKAY)
|
||||
if (mp_init_size (&t2, B * 2) != MP_OKAY)
|
||||
goto T1;
|
||||
if (mp_init (&x0y0) != MP_OKAY)
|
||||
if (mp_init_size (&x0y0, B * 2) != MP_OKAY)
|
||||
goto T2;
|
||||
if (mp_init (&x1y1) != MP_OKAY)
|
||||
if (mp_init_size (&x1y1, B * 2) != MP_OKAY)
|
||||
goto X0Y0;
|
||||
|
||||
/* now shift the digits */
|
||||
@ -76,18 +75,32 @@ mp_karatsuba_mul (mp_int * a, mp_int * b, mp_int * c)
|
||||
x1.used = a->used - B;
|
||||
y1.used = b->used - B;
|
||||
|
||||
/* we copy the digits directly instead of using higher level functions
|
||||
* since we also need to shift the digits
|
||||
*/
|
||||
for (x = 0; x < B; x++) {
|
||||
x0.dp[x] = a->dp[x];
|
||||
y0.dp[x] = b->dp[x];
|
||||
}
|
||||
for (x = B; x < a->used; x++) {
|
||||
x1.dp[x - B] = a->dp[x];
|
||||
}
|
||||
for (x = B; x < b->used; x++) {
|
||||
y1.dp[x - B] = b->dp[x];
|
||||
{
|
||||
register int x;
|
||||
register mp_digit *tmpa, *tmpb, *tmpx, *tmpy;
|
||||
|
||||
/* we copy the digits directly instead of using higher level functions
|
||||
* since we also need to shift the digits
|
||||
*/
|
||||
tmpa = a->dp;
|
||||
tmpb = b->dp;
|
||||
|
||||
tmpx = x0.dp;
|
||||
tmpy = y0.dp;
|
||||
for (x = 0; x < B; x++) {
|
||||
*tmpx++ = *tmpa++;
|
||||
*tmpy++ = *tmpb++;
|
||||
}
|
||||
|
||||
tmpx = x1.dp;
|
||||
for (x = B; x < a->used; x++) {
|
||||
*tmpx++ = *tmpa++;
|
||||
}
|
||||
|
||||
tmpy = y1.dp;
|
||||
for (x = B; x < b->used; x++) {
|
||||
*tmpy++ = *tmpb++;
|
||||
}
|
||||
}
|
||||
|
||||
/* only need to clamp the lower words since by definition the upper words x1/y1 must
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -23,8 +23,7 @@ int
|
||||
mp_karatsuba_sqr (mp_int * a, mp_int * b)
|
||||
{
|
||||
mp_int x0, x1, t1, t2, x0x0, x1x1;
|
||||
int B, err, x;
|
||||
|
||||
int B, err;
|
||||
|
||||
err = MP_MEM;
|
||||
|
||||
@ -41,22 +40,31 @@ mp_karatsuba_sqr (mp_int * a, mp_int * b)
|
||||
goto X0;
|
||||
|
||||
/* init temps */
|
||||
if (mp_init (&t1) != MP_OKAY)
|
||||
if (mp_init_size (&t1, a->used * 2) != MP_OKAY)
|
||||
goto X1;
|
||||
if (mp_init (&t2) != MP_OKAY)
|
||||
if (mp_init_size (&t2, a->used * 2) != MP_OKAY)
|
||||
goto T1;
|
||||
if (mp_init (&x0x0) != MP_OKAY)
|
||||
if (mp_init_size (&x0x0, B * 2) != MP_OKAY)
|
||||
goto T2;
|
||||
if (mp_init (&x1x1) != MP_OKAY)
|
||||
if (mp_init_size (&x1x1, (a->used - B) * 2) != MP_OKAY)
|
||||
goto X0X0;
|
||||
|
||||
/* now shift the digits */
|
||||
for (x = 0; x < B; x++) {
|
||||
x0.dp[x] = a->dp[x];
|
||||
}
|
||||
{
|
||||
register int x;
|
||||
register mp_digit *dst, *src;
|
||||
|
||||
for (x = B; x < a->used; x++) {
|
||||
x1.dp[x - B] = a->dp[x];
|
||||
src = a->dp;
|
||||
|
||||
/* now shift the digits */
|
||||
dst = x0.dp;
|
||||
for (x = 0; x < B; x++) {
|
||||
*dst++ = *src++;
|
||||
}
|
||||
|
||||
dst = x1.dp;
|
||||
for (x = B; x < a->used; x++) {
|
||||
*dst++ = *src++;
|
||||
}
|
||||
}
|
||||
|
||||
x0.used = B;
|
||||
@ -77,7 +85,7 @@ mp_karatsuba_sqr (mp_int * a, mp_int * b)
|
||||
goto X1X1; /* t1 = (x1 - x0) * (y1 - y0) */
|
||||
|
||||
/* add x0y0 */
|
||||
if (mp_add (&x0x0, &x1x1, &t2) != MP_OKAY)
|
||||
if (s_mp_add (&x0x0, &x1x1, &t2) != MP_OKAY)
|
||||
goto X1X1; /* t2 = x0y0 + x1y1 */
|
||||
if (mp_sub (&t2, &t1, &t1) != MP_OKAY)
|
||||
goto X1X1; /* t1 = x0y0 + x1y1 - (x1-x0)*(y1-y0) */
|
||||
|
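Note on the hunk above: its comments describe the Karatsuba middle term as t1 = x0y0 + x1y1 - (x1-x0)*(y1-y0). As a rough illustration only (this is not the library's code), the same identity can be checked on one 32-bit operand pair split into 16-bit halves. The name `karatsuba32` and the fixed 16-bit split are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: multiply two 32-bit values with three 16x16
 * products instead of four, using the identity quoted in the diff. */
static uint64_t karatsuba32(uint32_t x, uint32_t y)
{
    int64_t x0 = (int64_t)(x & 0xFFFFu), x1 = (int64_t)(x >> 16);
    int64_t y0 = (int64_t)(y & 0xFFFFu), y1 = (int64_t)(y >> 16);

    int64_t x0y0 = x0 * y0;                 /* low product            */
    int64_t x1y1 = x1 * y1;                 /* high product           */
    int64_t t1   = (x1 - x0) * (y1 - y0);   /* may be negative        */
    int64_t mid  = x0y0 + x1y1 - t1;        /* equals x0*y1 + x1*y0   */

    assert(mid >= 0);
    return (uint64_t)x0y0 + ((uint64_t)mid << 16) + ((uint64_t)x1y1 << 32);
}
```

Because x0y0 and x1y1 are both non-negative, their sum needs no sign handling, which is consistent with the hunk switching that one addition to the unsigned low-level `s_mp_add`.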
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
35
bn_mp_lshd.c
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -31,16 +31,31 @@ mp_lshd (mp_int * a, int b)
|
||||
return res;
|
||||
}
|
||||
|
||||
/* increment the used by the shift amount than copy upwards */
|
||||
a->used += b;
|
||||
for (x = a->used - 1; x >= b; x--) {
|
||||
a->dp[x] = a->dp[x - b];
|
||||
}
|
||||
{
|
||||
register mp_digit *tmpa, *tmpaa;
|
||||
|
||||
/* zero the lower digits */
|
||||
for (x = 0; x < b; x++) {
|
||||
a->dp[x] = 0;
|
||||
/* increment the used by the shift amount than copy upwards */
|
||||
a->used += b;
|
||||
|
||||
/* top */
|
||||
tmpa = a->dp + a->used - 1;
|
||||
|
||||
/* base */
|
||||
tmpaa = a->dp + a->used - 1 - b;
|
||||
|
||||
/* much like mp_rshd this is implemented using a sliding window
|
||||
* except the window goes the otherway around. Copying from
|
||||
* the bottom to the top. see bn_mp_rshd.c for more info.
|
||||
*/
|
||||
for (x = a->used - 1; x >= b; x--) {
|
||||
*tmpa-- = *tmpaa--;
|
||||
}
|
||||
|
||||
/* zero the lower digits */
|
||||
tmpa = a->dp;
|
||||
for (x = 0; x < b; x++) {
|
||||
*tmpa++ = 0;
|
||||
}
|
||||
}
|
||||
mp_clamp (a);
|
||||
return MP_OKAY;
|
||||
}
|
||||
|
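Note on the mp_lshd hunk above: the new code walks two pointers from the top of the digit array downwards so that the overlapping copy never overwrites a digit before it has been read. A minimal sketch of that idea on a plain array follows; `shift_digits_up` and its parameters are hypothetical names, and the caller is assumed to have already reserved room for `used + b` digits (the role mp_grow plays in the library).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: move `used` digits up by `b` positions in place
 * and zero the vacated low digits. Requires storage for used + b digits. */
static void shift_digits_up(unsigned *dp, size_t used, size_t b)
{
    unsigned *dst, *src;
    size_t x;

    assert(dp != NULL);
    if (b == 0) {
        return;
    }

    dst = dp + used + b;   /* one past the new top digit */
    src = dp + used;       /* one past the old top digit */

    /* copy from the top down so source digits are read before being overwritten */
    for (x = 0; x < used; x++) {
        *--dst = *--src;
    }

    /* zero the lower digits */
    for (x = 0; x < b; x++) {
        dp[x] = 0;
    }
}
```

The sketch uses pre-decrement so that neither pointer ever steps below the start of the array.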
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -18,36 +18,29 @@
 int
 mp_montgomery_setup (mp_int * a, mp_digit * mp)
 {
-  mp_int  t, tt;
-  int     res;
+  unsigned long x, b;
 
-  if ((res = mp_init (&t)) != MP_OKAY) {
-    return res;
+/* fast inversion mod 2^32
+ *
+ * Based on the fact that
+ *
+ * XA = 1 (mod 2^n)  =>  (X(2-XA)) A = 1 (mod 2^2n)
+ *                   =>  2*X*A - X*X*A*A = 1
+ *                   =>  2*(1) - (1)     = 1
+ */
+  b = a->dp[0];
+
+  if ((b & 1) == 0) {
+    return MP_VAL;
   }
 
-  if ((res = mp_init (&tt)) != MP_OKAY) {
-    goto __T;
-  }
-
-  /* tt = b */
-  tt.dp[0] = 0;
-  tt.dp[1] = 1;
-  tt.used = 2;
-
-  /* t = m mod b */
-  t.dp[0] = a->dp[0];
-  t.used = 1;
-
-  /* t = 1/m mod b */
-  if ((res = mp_invmod (&t, &tt, &t)) != MP_OKAY) {
-    goto __TT;
-  }
+  x = (((b + 2) & 4) << 1) + b; /* here x*a==1 mod 2^4 */
+  x *= 2 - b * x;               /* here x*a==1 mod 2^8 */
+  x *= 2 - b * x;               /* here x*a==1 mod 2^16; each step doubles the nb of bits */
+  x *= 2 - b * x;               /* here x*a==1 mod 2^32 */
 
   /* t = -1/m mod b */
-  *mp = ((mp_digit) 1 << ((mp_digit) DIGIT_BIT)) - t.dp[0];
+  *mp = ((mp_digit) 1 << ((mp_digit) DIGIT_BIT)) - (x & MP_MASK);
 
-  res = MP_OKAY;
-__TT:mp_clear (&tt);
-__T:mp_clear (&t);
-  return res;
+  return MP_OKAY;
 }
|
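Note on the mp_montgomery_setup hunk above: the replacement code inverts the low digit modulo a power of two by Newton (Hensel) lifting, where each step x *= 2 - b*x doubles the number of correct low bits. The sketch below isolates that step on fixed-width integers so it can be checked independently. The names `inv_mod_2_32` and `montgomery_rho` and the `digit_bits` parameter are assumptions made for illustration; the library itself works on `mp_digit` with `DIGIT_BIT` and `MP_MASK`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: inverse of an odd b modulo 2^32. */
static uint32_t inv_mod_2_32(uint32_t b)
{
    uint32_t x;

    assert(b & 1u);                    /* even values have no inverse mod 2^32 */

    x = (((b + 2u) & 4u) << 1) + b;    /* x*b == 1 (mod 2^4)  */
    x *= 2u - b * x;                   /* x*b == 1 (mod 2^8)  */
    x *= 2u - b * x;                   /* x*b == 1 (mod 2^16) */
    x *= 2u - b * x;                   /* x*b == 1 (mod 2^32) */
    return x;
}

/* Hypothetical helper: rho = -1/b mod 2^digit_bits, for 1 <= digit_bits <= 31. */
static uint32_t montgomery_rho(uint32_t b, unsigned digit_bits)
{
    uint32_t mask;

    assert(digit_bits >= 1u && digit_bits <= 31u);
    mask = ((uint32_t)1 << digit_bits) - 1u;
    return (((uint32_t)1 << digit_bits) - (inv_mod_2_32(b) & mask)) & mask;
}
```

An inverse that is correct modulo 2^32 is also correct modulo any smaller power of two, which is why masking the result down to the digit width is enough.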
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -50,6 +50,11 @@ mp_mul_2 (mp_int * a, mp_int * b)
       if ((res = mp_grow (b, b->used + 1)) != MP_OKAY) {
         return res;
       }
+
+      /* after the grow *tmpb is no longer valid so we have to reset it!
+       * (this bug took me about 17 minutes to find...!)
+       */
+      tmpb = b->dp + b->used;
     }
     /* add a MSB of 1 */
     *tmpb = 1;
@ -61,5 +66,6 @@ mp_mul_2 (mp_int * a, mp_int * b)
       *tmpb++ = 0;
     }
   }
+  b->sign = a->sign;
   return MP_OKAY;
 }
|
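Note on the mp_mul_2 hunk above: the fix re-derives the digit pointer after the buffer is grown, because growing may move the storage and leave the old alias dangling. The following is a minimal sketch of the same pattern on a hypothetical `digit_buf` type (not the library's `mp_int`); `buf_grow` and `buf_push_carry` are illustrative names, and `dp` is assumed to be non-NULL on entry.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable digit buffer, loosely modelled on mp_int. */
typedef struct {
    unsigned *dp;    /* digits, least significant first */
    int       used;  /* digits in use                   */
    int       alloc; /* digits allocated                */
} digit_buf;

/* Ensure room for at least `size` digits; returns 0 on success, -1 on failure. */
static int buf_grow(digit_buf *b, int size)
{
    unsigned *tmp;

    if (size <= b->alloc) {
        return 0;
    }
    tmp = realloc(b->dp, (size_t)size * sizeof *tmp);
    if (tmp == NULL) {
        return -1;
    }
    b->dp = tmp;       /* the block may have moved */
    b->alloc = size;
    return 0;
}

/* Append a carry digit of 1 above the current top digit. */
static int buf_push_carry(digit_buf *b)
{
    unsigned *top;

    assert(b != NULL && b->dp != NULL);
    top = b->dp + b->used;             /* alias valid only until the next grow */
    if (b->used == b->alloc) {
        if (buf_grow(b, b->used + 1) != 0) {
            return -1;
        }
        top = b->dp + b->used;         /* re-derive: realloc may have moved dp */
    }
    *top = 1;
    b->used++;
    return 0;
}
```

Writing through the old `top` after a successful grow would be undefined behaviour whenever realloc relocates the block, which matches the failure the diff comment describes.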
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -32,9 +32,11 @@ mp_mul_2d (mp_int * a, int b, mp_int * c)
   }
 
   /* shift by as many digits in the bit count */
-  if ((res = mp_lshd (c, b / DIGIT_BIT)) != MP_OKAY) {
-    return res;
-  }
+  if (b >= DIGIT_BIT) {
+    if ((res = mp_lshd (c, b / DIGIT_BIT)) != MP_OKAY) {
+      return res;
+    }
+  }
   c->used = c->alloc;
 
   /* shift any bit count < DIGIT_BIT */
|
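Note on the mp_mul_2d hunk above: the change skips the whole-digit shift when the bit count is smaller than one digit, leaving only the residual bit shift. The sketch below shows that split (b / DIGIT_BIT whole digits, b % DIGIT_BIT residual bits) on a plain array. The 28-bit digit width, the name `mul_2d_digits`, and the capacity handling are assumptions for this example rather than the library's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DIGIT_BIT  28
#define DIGIT_MASK ((((uint32_t)1) << DIGIT_BIT) - 1u)

/* Hypothetical helper: multiply the n-digit little-endian number in dp by 2^b.
 * cap is the storage available in digits. Returns the new digit count,
 * or -1 if the result might not fit. */
static int mul_2d_digits(uint32_t *dp, int n, int cap, int b)
{
    int whole, bits, x;

    assert(dp != NULL && b >= 0 && n >= 0 && n <= cap);
    whole = b / DIGIT_BIT;
    bits  = b % DIGIT_BIT;
    if (n + whole + 1 > cap) {
        return -1;
    }

    /* move whole digits only when there is at least one to move */
    if (whole > 0) {
        memmove(dp + whole, dp, (size_t)n * sizeof *dp);
        memset(dp, 0, (size_t)whole * sizeof *dp);
        n += whole;
    }

    /* shift any bit count < DIGIT_BIT, carrying between digits */
    if (bits > 0) {
        uint32_t carry = 0;
        for (x = 0; x < n; x++) {
            uint32_t next = dp[x] >> (DIGIT_BIT - bits);
            dp[x] = ((dp[x] << bits) | carry) & DIGIT_MASK;
            carry = next;
        }
        if (carry != 0) {
            dp[n++] = carry;
        }
    }

    /* drop leading zero digits */
    while (n > 0 && dp[n - 1] == 0) {
        n--;
    }
    return n;
}
```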
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
37
bn_mp_rshd.c
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -20,7 +20,6 @@ mp_rshd (mp_int * a, int b)
 {
   int     x;
 
-
   /* if b <= 0 then ignore it */
   if (b <= 0) {
     return;
@ -32,14 +31,34 @@ mp_rshd (mp_int * a, int b)
     return;
   }
 
-  /* shift the digits down */
-  for (x = 0; x < (a->used - b); x++) {
-    a->dp[x] = a->dp[x + b];
-  }
+  {
+    register mp_digit *tmpa, *tmpaa;
 
-  /* zero the top digits */
-  for (; x < a->used; x++) {
-    a->dp[x] = 0;
+    /* shift the digits down */
+
+    /* base */
+    tmpa = a->dp;
+
+    /* offset into digits */
+    tmpaa = a->dp + b;
+
+    /* this is implemented as a sliding window where the window is b-digits long
+     * and digits from the top of the window are copied to the bottom
+     *
+     * e.g.
+
+     b-2 | b-1 | b0 | b1 | b2 | ... | bb |   ---->
+                 /\                   |      ---->
+                  \-------------------/      ---->
+     */
+    for (x = 0; x < (a->used - b); x++) {
+      *tmpa++ = *tmpaa++;
+    }
+
+    /* zero the top digits */
+    for (; x < a->used; x++) {
+      *tmpa++ = 0;
+    }
   }
   mp_clamp (a);
 }
|
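Note on the mp_rshd hunk above: the rewritten loop slides a destination pointer and a source pointer that sits b digits higher up the same array, then zeroes whatever is left at the top. A minimal sketch on a plain array follows; `shift_digits_down` and its parameters are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: move `used` digits down by `b` positions in place
 * and zero the vacated top digits. */
static void shift_digits_down(unsigned *dp, size_t used, size_t b)
{
    unsigned *dst, *src;
    size_t x;

    assert(dp != NULL);
    if (b == 0) {
        return;
    }

    /* shifting by the whole length or more leaves only zeros */
    if (b >= used) {
        for (x = 0; x < used; x++) {
            dp[x] = 0;
        }
        return;
    }

    dst = dp;        /* base               */
    src = dp + b;    /* offset into digits */

    /* copying upwards is safe here because dst always trails src */
    for (x = 0; x < used - b; x++) {
        *dst++ = *src++;
    }

    /* zero the top digits */
    for (; x < used; x++) {
        *dst++ = 0;
    }
}
```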
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
@ -55,8 +55,14 @@ s_mp_add (mp_int * a, mp_int * b, mp_int * c)
     register int i;
 
     /* alias for digit pointers */
+
+    /* first input */
     tmpa = a->dp;
+
+    /* second input */
     tmpb = b->dp;
+
+    /* destination */
     tmpc = c->dp;
 
     u = 0;
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
* The library is free for all purposes without any express
|
||||
* guarantee it works.
|
||||
*
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
|
||||
* Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
|
||||
*/
|
||||
#include <tommath.h>
|
||||
|
||||
|
11
bncore.c
@ -10,10 +10,13 @@
  * The library is free for all purposes without any express
  * guarantee it works.
  *
- * Tom St Denis, tomstdenis@iahu.ca, http://libtommath.iahu.ca
+ * Tom St Denis, tomstdenis@iahu.ca, http://math.libtomcrypt.org
  */
 #include <tommath.h>
 
-int     KARATSUBA_MUL_CUTOFF = 80,      /* Min. number of digits before Karatsuba multiplication is used. */
-        KARATSUBA_SQR_CUTOFF = 80,      /* Min. number of digits before Karatsuba squaring is used. */
-        MONTGOMERY_EXPT_CUTOFF = 74;    /* max. number of digits that montgomery reductions will help for */
+/* configured for a AMD Duron Morgan core with etc/tune.c */
+int     KARATSUBA_MUL_CUTOFF = 73,      /* Min. number of digits before Karatsuba multiplication is used. */
+        KARATSUBA_SQR_CUTOFF = 121,     /* Min. number of digits before Karatsuba squaring is used. */
+        MONTGOMERY_EXPT_CUTOFF = 128;   /* max. number of digits that montgomery reductions will help for */
+
+
|
13
changes.txt
@ -1,3 +1,16 @@
+Mar 15th, 2003
+v0.14  -- Tons of manual updates
+       -- cleaned up the directory
+       -- added MSVC makefiles
+       -- source changes [that I don't recall]
+       -- Fixed up the lshd/rshd code to use pointer aliasing
+       -- Fixed up the mul_2d and div_2d to not call rshd/lshd unless needed
+       -- Fixed up etc/tune.c a tad
+       -- fixed up demo/demo.c to output comma-delimited results of timing
+          also fixed up timing demo to use a finer granularity for various functions
+       -- fixed up demo/demo.c testing to pause during testing so my Duron won't catch on fire
+          [stays around 31-35C during testing :-)]
+
 Feb 13th, 2003
 v0.13  -- tons of minor speed-ups in low level add, sub, mul_2 and div_2 which propagate
           to other functions like mp_invmod, mp_div, etc...
|
116
demo/demo.c
@ -69,18 +69,32 @@ int mp_reduce_setup(mp_int *a, mp_int *b)
|
||||
}
|
||||
return mp_div(a, b, a, NULL);
|
||||
}
|
||||
|
||||
int mp_rand(mp_int *a, int c)
|
||||
{
|
||||
long z = abs(rand()) & 65535;
|
||||
mp_set(a, z?z:1);
|
||||
while (c--) {
|
||||
s_mp_lshd(a, 1);
|
||||
mp_add_d(a, abs(rand()), a);
|
||||
}
|
||||
return MP_OKAY;
|
||||
}
|
||||
#endif
|
||||
|
||||
char cmd[4096], buf[4096];
|
||||
int main(void)
|
||||
{
|
||||
mp_int a, b, c, d, e, f;
|
||||
unsigned long expt_n, add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n, inv_n;
|
||||
unsigned long expt_n, add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n, inv_n,
|
||||
div2_n, mul2_n;
|
||||
unsigned rr;
|
||||
int cnt;
|
||||
|
||||
#ifdef TIMER
|
||||
int n;
|
||||
ulong64 tt;
|
||||
FILE *log;
|
||||
#endif
|
||||
|
||||
mp_init(&a);
|
||||
@ -90,60 +104,66 @@ int main(void)
|
||||
mp_init(&e);
|
||||
mp_init(&f);
|
||||
|
||||
|
||||
#ifdef TIMER
|
||||
goto multtime;
|
||||
|
||||
printf("CLOCKS_PER_SEC == %lu\n", CLOCKS_PER_SEC);
|
||||
mp_read_radix(&a, "340282366920938463463374607431768211455", 10);
|
||||
mp_read_radix(&b, "340282366920938463463574607431768211455", 10);
|
||||
while (a.used * DIGIT_BIT < 8192) {
|
||||
goto expttime;
|
||||
|
||||
log = fopen("add.log", "w");
|
||||
for (cnt = 4; cnt <= 128; cnt += 4) {
|
||||
mp_rand(&a, cnt);
|
||||
mp_rand(&b, cnt);
|
||||
reset();
|
||||
for (rr = 0; rr < 10000000; rr++) {
|
||||
mp_add(&a, &b, &c);
|
||||
}
|
||||
tt = rdtsc();
|
||||
printf("Adding\t\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt);
|
||||
mp_sqr(&a, &a);
|
||||
mp_sqr(&b, &b);
|
||||
fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt);
|
||||
}
|
||||
fclose(log);
|
||||
|
||||
mp_read_radix(&a, "340282366920938463463374607431768211455", 10);
|
||||
mp_read_radix(&b, "340282366920938463463574607431768211455", 10);
|
||||
while (a.used * DIGIT_BIT < 8192) {
|
||||
log = fopen("sub.log", "w");
|
||||
for (cnt = 4; cnt <= 128; cnt += 4) {
|
||||
mp_rand(&a, cnt);
|
||||
mp_rand(&b, cnt);
|
||||
reset();
|
||||
for (rr = 0; rr < 10000000; rr++) {
|
||||
mp_sub(&a, &b, &c);
|
||||
}
|
||||
tt = rdtsc();
|
||||
printf("Subtracting\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt);
|
||||
mp_sqr(&a, &a);
|
||||
mp_sqr(&b, &b);
|
||||
printf("Subtracting\t\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt);
|
||||
fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt);
|
||||
}
|
||||
fclose(log);
|
||||
|
||||
multtime:
|
||||
|
||||
mp_read_radix(&a, "340282366920938463463374607431768211455", 10);
|
||||
while (a.used * DIGIT_BIT < 8192) {
|
||||
log = fopen("sqr.log", "w");
|
||||
for (cnt = 4; cnt <= 128; cnt += 4) {
|
||||
mp_rand(&a, cnt);
|
||||
reset();
|
||||
for (rr = 0; rr < 250000; rr++) {
|
||||
mp_sqr(&a, &b);
|
||||
}
|
||||
tt = rdtsc();
|
||||
printf("Squaring\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt);
|
||||
mp_copy(&b, &a);
|
||||
fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt);
|
||||
}
|
||||
fclose(log);
|
||||
|
||||
mp_read_radix(&a, "340282366920938463463374607431768211455", 10);
|
||||
while (a.used * DIGIT_BIT < 8192) {
|
||||
log = fopen("mult.log", "w");
|
||||
for (cnt = 4; cnt <= 128; cnt += 4) {
|
||||
mp_rand(&a, cnt);
|
||||
mp_rand(&b, cnt);
|
||||
reset();
|
||||
for (rr = 0; rr < 250000; rr++) {
|
||||
mp_mul(&a, &a, &b);
|
||||
mp_mul(&a, &b, &c);
|
||||
}
|
||||
tt = rdtsc();
|
||||
printf("Multiplying\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt);
|
||||
mp_copy(&b, &a);
|
||||
fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt);
|
||||
}
|
||||
fclose(log);
|
||||
|
||||
expttime:
|
||||
{
|
||||
@ -157,6 +177,7 @@ expttime:
|
||||
"1214855636816562637502584060163403830270705000634713483015101384881871978446801224798536155406895823305035467591632531067547890948695117172076954220727075688048751022421198712032848890056357845974246560748347918630050853933697792254955890439720297560693579400297062396904306270145886830719309296352765295712183040773146419022875165382778007040109957609739589875590885701126197906063620133954893216612678838507540777138437797705602453719559017633986486649523611975865005712371194067612263330335590526176087004421363598470302731349138773205901447704682181517904064735636518462452242791676541725292378925568296858010151852326316777511935037531017413910506921922450666933202278489024521263798482237150056835746454842662048692127173834433089016107854491097456725016327709663199738238442164843147132789153725513257167915555162094970853584447993125488607696008169807374736711297007473812256272245489405898470297178738029484459690836250560495461579533254473316340608217876781986188705928270735695752830825527963838355419762516246028680280988020401914551825487349990306976304093109384451438813251211051597392127491464898797406789175453067960072008590614886532333015881171367104445044718144312416815712216611576221546455968770801413440778423979",
|
||||
NULL
|
||||
};
|
||||
log = fopen("expt.log", "w");
|
||||
for (n = 0; primes[n]; n++) {
|
||||
mp_read_radix(&a, primes[n], 10);
|
||||
mp_zero(&b);
|
||||
@ -183,12 +204,21 @@ expttime:
|
||||
exit(0);
|
||||
}
|
||||
printf("Exponentiating\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt);
|
||||
fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt);
|
||||
}
|
||||
}
|
||||
|
||||
mp_read_radix(&a, "340282366920938463463374607431768211455", 10);
|
||||
mp_read_radix(&b, "234892374891378913789237289378973232333", 10);
|
||||
while (a.used * DIGIT_BIT < 8192) {
|
||||
fclose(log);
|
||||
invtime:
|
||||
log = fopen("invmod.log", "w");
|
||||
for (cnt = 4; cnt <= 128; cnt += 4) {
|
||||
mp_rand(&a, cnt);
|
||||
mp_rand(&b, cnt);
|
||||
|
||||
do {
|
||||
mp_add_d(&b, 1, &b);
|
||||
mp_gcd(&a, &b, &c);
|
||||
} while (mp_cmp_d(&c, 1) != MP_EQ);
|
||||
|
||||
reset();
|
||||
for (rr = 0; rr < 10000; rr++) {
|
||||
mp_invmod(&b, &a, &c);
|
||||
@ -200,16 +230,18 @@ expttime:
|
||||
return 0;
|
||||
}
|
||||
printf("Inverting mod\t%4d-bit => %9llu/sec, %9llu ticks\n", mp_count_bits(&a), (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt, tt);
|
||||
mp_sqr(&a, &a);
|
||||
mp_sqr(&b, &b);
|
||||
fprintf(log, "%d,%9llu\n", cnt, (((unsigned long long)rr)*CLOCKS_PER_SEC)/tt);
|
||||
}
|
||||
fclose(log);
|
||||
|
||||
return 0;
|
||||
|
||||
#endif
|
||||
|
||||
inv_n = expt_n = lcm_n = gcd_n = add_n = sub_n = mul_n = div_n = sqr_n = mul2d_n = div2d_n = 0;
|
||||
div2_n = mul2_n = inv_n = expt_n = lcm_n = gcd_n = add_n =
|
||||
sub_n = mul_n = div_n = sqr_n = mul2d_n = div2d_n = cnt = 0;
|
||||
for (;;) {
|
||||
if (!(++cnt & 15)) sleep(3);
|
||||
|
||||
/* randomly clear and re-init one variable, this has the affect of triming the alloc space */
|
||||
switch (abs(rand()) % 7) {
|
||||
@ -223,7 +255,7 @@ expttime:
|
||||
}
|
||||
|
||||
|
||||
printf("%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%5d\r", add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n, expt_n, inv_n, _ifuncs);
|
||||
printf("%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu/%7lu ", add_n, sub_n, mul_n, div_n, sqr_n, mul2d_n, div2d_n, gcd_n, lcm_n, expt_n, inv_n, div2_n, mul2_n);
|
||||
fgets(cmd, 4095, stdin);
|
||||
cmd[strlen(cmd)-1] = 0;
|
||||
printf("%s ]\r",cmd); fflush(stdout);
|
||||
@ -386,7 +418,29 @@ draw(&a);draw(&b);draw(&c);draw(&d);
|
||||
return 0;
|
||||
}
|
||||
|
||||
}
|
||||
} else if (!strcmp(cmd, "div2")) { ++div2_n;
   fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 10);
   fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 10);
   mp_div_2(&a, &c);
   if (mp_cmp(&c, &b) != MP_EQ) {
      printf("div_2 %lu failure\n", div2_n);
      draw(&a);
      draw(&b);
      draw(&c);
      return 0;
   }
} else if (!strcmp(cmd, "mul2")) { ++mul2_n;
   fgets(buf, 4095, stdin); mp_read_radix(&a, buf, 10);
   fgets(buf, 4095, stdin); mp_read_radix(&b, buf, 10);
   mp_mul_2(&a, &c);
   if (mp_cmp(&c, &b) != MP_EQ) {
      printf("mul_2 %lu failure\n", mul2_n);
      draw(&a);
      draw(&b);
      draw(&c);
      return 0;
   }
}
|
||||
|
||||
}
|
||||
return 0;
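Note on the `invtime` loop in this file's diff: before timing `mp_invmod`, it bumps `b` until `gcd(a, b) == 1`, since an inverse of `b` modulo `a` exists only when the two are coprime. The following is a minimal sketch of that idea on 64-bit integers, not the library's code; every `sketch_` name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Euclid's algorithm on machine words (stand-in for mp_gcd). */
static uint64_t sketch_gcd(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Mirrors the do/while in the diff: increment b at least once,
 * then keep incrementing until it is coprime to a. */
static uint64_t sketch_next_coprime(uint64_t a, uint64_t b)
{
    do {
        b++;
    } while (sketch_gcd(a, b) != 1);
    return b;
}

/* Inverse of b modulo a via the extended Euclidean algorithm
 * (stand-in for mp_invmod). Returns 0 when no inverse exists. */
static int64_t sketch_invmod(int64_t b, int64_t a)
{
    int64_t r0 = a, r1 = b % a;   /* remainders  */
    int64_t t0 = 0, t1 = 1;       /* coefficients of b */

    assert(a > 1 && b >= 0);
    while (r1 != 0) {
        int64_t q  = r0 / r1;
        int64_t r2 = r0 - q * r1;
        int64_t t2 = t0 - q * t1;
        r0 = r1; r1 = r2;
        t0 = t1; t1 = t2;
    }
    if (r0 != 1) {
        return 0;                 /* gcd(a, b) != 1: not invertible */
    }
    return (t0 < 0) ? t0 + a : t0;
}
```

With `a = 30` and a starting `b = 5`, the search skips 6 and settles on 7, whose inverse modulo 30 is 13 (7 * 13 = 91 = 3 * 30 + 1).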
|
||||
|
@ -17,4 +17,4 @@ mersenne: mersenne.o
|
||||
$(CC) mersenne.o $(LIBNAME) -o mersenne
|
||||
|
||||
clean:
|
||||
rm -f *.o *.exe pprime tune mersenne
|
||||
rm -f *.log *.o *.obj *.exe pprime tune mersenne
|
14
etc/makefile.msvc
Normal file
@ -0,0 +1,14 @@
#MSVC Makefile
#
#Tom St Denis

CFLAGS = /I../ /Ogityb2 /Gs /DWIN32 /W3

pprime: pprime.obj
	cl pprime.obj ../tommath.lib

mersenne: mersenne.obj
	cl mersenne.obj ../tommath.lib

tune: tune.obj
	cl tune.obj ../tommath.lib
@ -3,14 +3,14 @@
|
||||
* Tom St Denis, tomstdenis@iahu.ca
|
||||
*/
|
||||
#include <time.h>
|
||||
#include <bn.h>
|
||||
#include <tommath.h>
|
||||
|
||||
int
|
||||
is_mersenne (long s, int *pp)
|
||||
{
|
||||
mp_int n, u, mu;
|
||||
int res, k;
|
||||
long ss;
|
||||
mp_int n, u, mu;
|
||||
int res, k;
|
||||
long ss;
|
||||
|
||||
*pp = 0;
|
||||
|
||||
@ -85,7 +85,7 @@ __N:mp_clear (&n);
|
||||
long
|
||||
i_sqrt (long x)
|
||||
{
|
||||
long x1, x2;
|
||||
long x1, x2;
|
||||
|
||||
x2 = 16;
|
||||
do {
|
||||
@ -104,7 +104,7 @@ i_sqrt (long x)
|
||||
int
|
||||
isprime (long k)
|
||||
{
|
||||
long y, z;
|
||||
long y, z;
|
||||
|
||||
y = i_sqrt (k);
|
||||
for (z = 2; z <= y; z++) {
|
||||
@ -118,9 +118,9 @@ isprime (long k)
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
int pp;
|
||||
long k;
|
||||
clock_t tt;
|
||||
int pp;
|
||||
long k;
|
||||
clock_t tt;
|
||||
|
||||
k = 3;
|
||||
|
||||
|
20
etc/pprime.c
@ -8,10 +8,10 @@
|
||||
#include "tommath.h"
|
||||
|
||||
/* fast square root */
|
||||
static mp_digit
|
||||
static mp_digit
|
||||
i_sqrt (mp_word x)
|
||||
{
|
||||
mp_word x1, x2;
|
||||
mp_word x1, x2;
|
||||
|
||||
x2 = x;
|
||||
do {
|
||||
@ -28,10 +28,10 @@ i_sqrt (mp_word x)
|
||||
|
||||
|
||||
/* generates a prime digit */
|
||||
static mp_digit
|
||||
static mp_digit
|
||||
prime_digit ()
|
||||
{
|
||||
mp_digit r, x, y, next;
|
||||
mp_digit r, x, y, next;
|
||||
|
||||
/* make a DIGIT_BIT-bit random number */
|
||||
for (r = x = 0; x < DIGIT_BIT; x++) {
|
||||
@ -141,8 +141,8 @@ prime_digit ()
|
||||
int
|
||||
pprime (int k, int li, mp_int * p, mp_int * q)
|
||||
{
|
||||
mp_int a, b, c, n, x, y, z, v;
|
||||
int res, ii;
|
||||
mp_int a, b, c, n, x, y, z, v;
|
||||
int res, ii;
|
||||
static const mp_digit bases[] = { 2, 3, 5, 7, 11, 13, 17, 19 };
|
||||
|
||||
/* single digit ? */
|
||||
@ -329,10 +329,10 @@ __C:mp_clear (&c);
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
mp_int p, q;
|
||||
char buf[4096];
|
||||
int k, li;
|
||||
clock_t t1;
|
||||
mp_int p, q;
|
||||
char buf[4096];
|
||||
int k, li;
|
||||
clock_t t1;
|
||||
|
||||
srand (time (NULL));
|
||||
|
||||
|
100
etc/tune.c
@ -8,19 +8,19 @@
|
||||
clock_t
|
||||
time_mult (void)
|
||||
{
|
||||
clock_t t1;
|
||||
int x, y;
|
||||
mp_int a, b, c;
|
||||
clock_t t1;
|
||||
int x, y;
|
||||
mp_int a, b, c;
|
||||
|
||||
mp_init (&a);
|
||||
mp_init (&b);
|
||||
mp_init (&c);
|
||||
|
||||
t1 = clock ();
|
||||
for (x = 8; x <= 128; x += 8) {
|
||||
for (y = 0; y < 1000; y++) {
|
||||
mp_rand (&a, x);
|
||||
mp_rand (&b, x);
|
||||
for (x = 4; x <= 128; x += 4) {
|
||||
mp_rand (&a, x);
|
||||
mp_rand (&b, x);
|
||||
for (y = 0; y < 10000; y++) {
|
||||
mp_mul (&a, &b, &c);
|
||||
}
|
||||
}
|
||||
@ -33,17 +33,17 @@ time_mult (void)
|
||||
clock_t
|
||||
time_sqr (void)
|
||||
{
|
||||
clock_t t1;
|
||||
int x, y;
|
||||
mp_int a, b;
|
||||
clock_t t1;
|
||||
int x, y;
|
||||
mp_int a, b;
|
||||
|
||||
mp_init (&a);
|
||||
mp_init (&b);
|
||||
|
||||
t1 = clock ();
|
||||
for (x = 8; x <= 128; x += 8) {
|
||||
for (y = 0; y < 1000; y++) {
|
||||
mp_rand (&a, x);
|
||||
for (x = 4; x <= 128; x += 4) {
|
||||
mp_rand (&a, x);
|
||||
for (y = 0; y < 10000; y++) {
|
||||
mp_sqr (&a, &b);
|
||||
}
|
||||
}
|
||||
@ -52,20 +52,54 @@ time_sqr (void)
|
||||
return clock () - t1;
|
||||
}
|
||||
|
||||
clock_t
time_expt (void)
{
  clock_t t1;
  int x, y;
  mp_int a, b, c, d;

  mp_init (&a);
  mp_init (&b);
  mp_init (&c);
  mp_init (&d);

  t1 = clock ();
  for (x = 4; x <= 128; x += 4) {
    mp_rand (&a, x);
    mp_rand (&b, x);
    mp_rand (&c, x);
    if (mp_iseven (&c) != 0) {
      mp_add_d (&c, 1, &c);
    }
    for (y = 0; y < 10; y++) {
      mp_exptmod (&a, &b, &c, &d);
    }
  }
  mp_clear (&d);
  mp_clear (&c);
  mp_clear (&b);
  mp_clear (&a);

  return clock () - t1;
}
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
int best_mult, best_square;
|
||||
clock_t best, ti;
|
||||
int best_mult, best_square, best_exptmod;
|
||||
clock_t best, ti;
|
||||
FILE *log;
|
||||
|
||||
best_mult = best_square = 0;
|
||||
best_mult = best_square = best_exptmod = 0;
|
||||
|
||||
/* tune multiplication first */
|
||||
log = fopen ("mult.log", "w");
|
||||
best = CLOCKS_PER_SEC * 1000;
|
||||
for (KARATSUBA_MUL_CUTOFF = 8; KARATSUBA_MUL_CUTOFF <= 128;
|
||||
KARATSUBA_MUL_CUTOFF++) {
|
||||
for (KARATSUBA_MUL_CUTOFF = 8; KARATSUBA_MUL_CUTOFF <= 128; KARATSUBA_MUL_CUTOFF++) {
|
||||
ti = time_mult ();
|
||||
printf ("%4d : %9lu\r", KARATSUBA_MUL_CUTOFF, ti);
|
||||
fprintf (log, "%d, %lu\n", KARATSUBA_MUL_CUTOFF, ti);
|
||||
fflush (stdout);
|
||||
if (ti < best) {
|
||||
printf ("New best: %lu, %d \n", ti, KARATSUBA_MUL_CUTOFF);
|
||||
@ -73,13 +107,15 @@ main (void)
|
||||
best_mult = KARATSUBA_MUL_CUTOFF;
|
||||
}
|
||||
}
|
||||
fclose (log);
|
||||
|
||||
/* tune squaring */
|
||||
log = fopen ("sqr.log", "w");
|
||||
best = CLOCKS_PER_SEC * 1000;
|
||||
for (KARATSUBA_SQR_CUTOFF = 8; KARATSUBA_SQR_CUTOFF <= 128;
|
||||
KARATSUBA_SQR_CUTOFF++) {
|
||||
for (KARATSUBA_SQR_CUTOFF = 8; KARATSUBA_SQR_CUTOFF <= 128; KARATSUBA_SQR_CUTOFF++) {
|
||||
ti = time_sqr ();
|
||||
printf ("%4d : %9lu\r", KARATSUBA_SQR_CUTOFF, ti);
|
||||
fprintf (log, "%d, %lu\n", KARATSUBA_SQR_CUTOFF, ti);
|
||||
fflush (stdout);
|
||||
if (ti < best) {
|
||||
printf ("New best: %lu, %d \n", ti, KARATSUBA_SQR_CUTOFF);
|
||||
@ -87,10 +123,30 @@ main (void)
|
||||
best_square = KARATSUBA_SQR_CUTOFF;
|
||||
}
|
||||
}
|
||||
fclose (log);
|
||||
|
||||
/* tune exptmod */
|
||||
KARATSUBA_MUL_CUTOFF = best_mult;
|
||||
KARATSUBA_SQR_CUTOFF = best_square;
|
||||
|
||||
log = fopen ("expt.log", "w");
|
||||
best = CLOCKS_PER_SEC * 1000;
|
||||
for (MONTGOMERY_EXPT_CUTOFF = 8; MONTGOMERY_EXPT_CUTOFF <= 192; MONTGOMERY_EXPT_CUTOFF++) {
|
||||
ti = time_expt ();
|
||||
printf ("%4d : %9lu\r", MONTGOMERY_EXPT_CUTOFF, ti);
|
||||
fflush (stdout);
|
||||
fprintf (log, "%d, %lu\n", MONTGOMERY_EXPT_CUTOFF, ti);
|
||||
if (ti < best) {
|
||||
printf ("New best: %lu, %d\n", ti, MONTGOMERY_EXPT_CUTOFF);
|
||||
best = ti;
|
||||
best_exptmod = MONTGOMERY_EXPT_CUTOFF;
|
||||
}
|
||||
}
|
||||
fclose (log);
|
||||
|
||||
printf
|
||||
("\n\n\nKaratsuba Multiplier Cutoff: %d\nKaratsuba Squaring Cutoff: %d\n",
|
||||
best_mult, best_square);
|
||||
("\n\n\nKaratsuba Multiplier Cutoff: %d\nKaratsuba Squaring Cutoff: %d\nMontgomery exptmod Cutoff: %d\n",
|
||||
best_mult, best_square, best_exptmod);
|
||||
|
||||
return 0;
|
||||
}
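Note on the tuning loops in `etc/tune.c`: each one sweeps a cutoff over a fixed range, times a benchmark at every setting, and keeps the setting with the smallest tick count. The following is a minimal sketch of that search pattern with the benchmark abstracted into a callback. All `sketch_` names are hypothetical, and the fake cost function is a deterministic stand-in for a real timing run.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical callback: returns the measured cost (e.g. clock ticks)
 * of running the benchmark with the given cutoff. */
typedef unsigned long (*sketch_cost_fn)(int cutoff);

/* Try every cutoff in [lo, hi] and return the cheapest one.
 * The strict '<' means the first cutoff wins a tie, matching the
 * `if (ti < best)` test in the tuning loops. */
static int sketch_best_cutoff(int lo, int hi, sketch_cost_fn cost)
{
    unsigned long best = ULONG_MAX;
    int best_cutoff = lo;
    int c;

    assert(lo <= hi);
    for (c = lo; c <= hi; c++) {
        unsigned long t = cost(c);
        if (t < best) {
            best = t;
            best_cutoff = c;
        }
    }
    return best_cutoff;
}

/* Deterministic stand-in for a timing run: cheapest at cutoff 73. */
static unsigned long sketch_fake_cost(int cutoff)
{
    long d = (long)cutoff - 73;
    return (unsigned long)(d * d) + 1000ul;
}
```

Calling `sketch_best_cutoff(8, 128, sketch_fake_cost)` selects 73. Real timings are noisy, so the result of a sweep like this can vary between runs on the same machine.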
|
||||
|
4
makefile
@ -1,6 +1,6 @@
|
||||
CFLAGS += -I./ -Wall -W -Wshadow -O3 -fomit-frame-pointer -funroll-loops
|
||||
|
||||
VERSION=0.13
|
||||
VERSION=0.14
|
||||
|
||||
default: libtommath.a
|
||||
|
||||
@ -60,7 +60,7 @@ docs: docdvi
|
||||
rm -f bn.log bn.aux bn.dvi
|
||||
|
||||
clean:
|
||||
rm -f *.pdf *.o *.a *.exe etclib/*.o demo/demo.o test ltmtest mpitest mtest/mtest mtest/mtest.exe \
|
||||
rm -f *.pdf *.o *.a *.obj *.lib *.exe etclib/*.o demo/demo.o test ltmtest mpitest mtest/mtest mtest/mtest.exe \
|
||||
bn.log bn.aux bn.dvi *.log *.s mpi.c
|
||||
cd etc ; make clean
|
||||
|
||||
|
26
makefile.msvc
Normal file
@ -0,0 +1,26 @@
#MSVC Makefile
#
#Tom St Denis

CFLAGS = /I. /Ogityb2 /Gs /DWIN32 /W3

default: library

OBJECTS=bncore.obj bn_mp_init.obj bn_mp_clear.obj bn_mp_exch.obj bn_mp_grow.obj bn_mp_shrink.obj \
bn_mp_clamp.obj bn_mp_zero.obj bn_mp_set.obj bn_mp_set_int.obj bn_mp_init_size.obj bn_mp_copy.obj \
bn_mp_init_copy.obj bn_mp_abs.obj bn_mp_neg.obj bn_mp_cmp_mag.obj bn_mp_cmp.obj bn_mp_cmp_d.obj \
bn_mp_rshd.obj bn_mp_lshd.obj bn_mp_mod_2d.obj bn_mp_div_2d.obj bn_mp_mul_2d.obj bn_mp_div_2.obj \
bn_mp_mul_2.obj bn_s_mp_add.obj bn_s_mp_sub.obj bn_fast_s_mp_mul_digs.obj bn_s_mp_mul_digs.obj \
bn_fast_s_mp_mul_high_digs.obj bn_s_mp_mul_high_digs.obj bn_fast_s_mp_sqr.obj bn_s_mp_sqr.obj \
bn_mp_add.obj bn_mp_sub.obj bn_mp_karatsuba_mul.obj bn_mp_mul.obj bn_mp_karatsuba_sqr.obj \
bn_mp_sqr.obj bn_mp_div.obj bn_mp_mod.obj bn_mp_add_d.obj bn_mp_sub_d.obj bn_mp_mul_d.obj \
bn_mp_div_d.obj bn_mp_mod_d.obj bn_mp_expt_d.obj bn_mp_addmod.obj bn_mp_submod.obj \
bn_mp_mulmod.obj bn_mp_sqrmod.obj bn_mp_gcd.obj bn_mp_lcm.obj bn_fast_mp_invmod.obj bn_mp_invmod.obj \
bn_mp_reduce.obj bn_mp_montgomery_setup.obj bn_fast_mp_montgomery_reduce.obj bn_mp_montgomery_reduce.obj \
bn_mp_exptmod_fast.obj bn_mp_exptmod.obj bn_mp_2expt.obj bn_mp_n_root.obj bn_mp_jacobi.obj bn_reverse.obj \
bn_mp_count_bits.obj bn_mp_read_unsigned_bin.obj bn_mp_read_signed_bin.obj bn_mp_to_unsigned_bin.obj \
bn_mp_to_signed_bin.obj bn_mp_unsigned_bin_size.obj bn_mp_signed_bin_size.obj bn_radix.obj \
bn_mp_xor.obj bn_mp_and.obj bn_mp_or.obj bn_mp_rand.obj bn_mp_montgomery_calc_normalization.obj

library: $(OBJECTS)
	lib /out:tommath.lib $(OBJECTS)
@ -41,7 +41,7 @@ void rand_num(mp_int *a)
|
||||
unsigned char buf[512];
|
||||
|
||||
top:
|
||||
size = 1 + ((fgetc(rng)*fgetc(rng)) % 96);
|
||||
size = 1 + ((fgetc(rng)*fgetc(rng)) % 512);
|
||||
buf[0] = (fgetc(rng)&1)?1:0;
|
||||
fread(buf+1, 1, size, rng);
|
||||
for (n = 0; n < size; n++) {
|
||||
@ -57,7 +57,7 @@ void rand_num2(mp_int *a)
|
||||
unsigned char buf[512];
|
||||
|
||||
top:
|
||||
size = 1 + ((fgetc(rng)*fgetc(rng)) % 96);
|
||||
size = 1 + ((fgetc(rng)*fgetc(rng)) % 512);
|
||||
buf[0] = (fgetc(rng)&1)?1:0;
|
||||
fread(buf+1, 1, size, rng);
|
||||
for (n = 0; n < size; n++) {
|
||||
@ -72,6 +72,8 @@ int main(void)
|
||||
int n;
|
||||
mp_int a, b, c, d, e;
|
||||
char buf[4096];
|
||||
|
||||
static int tests[] = { 11, 12 };
|
||||
|
||||
mp_init(&a);
|
||||
mp_init(&b);
|
||||
@ -89,7 +91,7 @@ int main(void)
|
||||
}
|
||||
|
||||
for (;;) {
|
||||
n = 4; // fgetc(rng) % 11;
|
||||
n = fgetc(rng) % 13;
|
||||
|
||||
if (n == 0) {
|
||||
/* add tests */
|
||||
@ -235,7 +237,24 @@ int main(void)
|
||||
printf("%s\n", buf);
|
||||
mp_todecimal(&c, buf);
|
||||
printf("%s\n", buf);
|
||||
}
|
||||
} else if (n == 11) {
   rand_num(&a);
   mp_mul_2(&a, &a);
   mp_div_2(&a, &b);
   printf("div2\n");
   mp_todecimal(&a, buf);
   printf("%s\n", buf);
   mp_todecimal(&b, buf);
   printf("%s\n", buf);
} else if (n == 12) {
   rand_num2(&a);
   mp_mul_2(&a, &b);
   printf("mul2\n");
   mp_todecimal(&a, buf);
   printf("%s\n", buf);
   mp_todecimal(&b, buf);
   printf("%s\n", buf);
}
|
||||
}
|
||||
fclose(rng);
|
||||
return 0;
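Note on the new `mul2` and `div2` test vectors: doubling and halving a multi-digit integer is a one-bit shift with a carry passed between digits. The following is a minimal sketch of that carry propagation on little-endian digit arrays, not the library's `mp_mul_2` or `mp_div_2`. The 28-bit digit width and all `sketch_`/`SKETCH_` names are assumptions made for illustration, and sign handling and array growth are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DIGIT_BITS 28u   /* assumed digit width, not taken from the diff */
#define SKETCH_DIGIT_MASK ((1u << SKETCH_DIGIT_BITS) - 1u)

/* b = 2 * a over n little-endian digits (a and b may alias).
 * Returns the bit carried out of the top digit (0 or 1). */
static uint32_t sketch_mul_2(const uint32_t *a, uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t next;
        assert(a[i] <= SKETCH_DIGIT_MASK);
        next = a[i] >> (SKETCH_DIGIT_BITS - 1u);      /* top bit moves up */
        b[i] = ((a[i] << 1) | carry) & SKETCH_DIGIT_MASK;
        carry = next;
    }
    return carry;
}

/* b = floor(a / 2) over n little-endian digits (a and b may alias).
 * Returns the bit shifted out of the lowest digit (0 or 1). */
static uint32_t sketch_div_2(const uint32_t *a, uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    size_t i;

    for (i = n; i-- > 0; ) {
        uint32_t next = a[i] & 1u;                    /* low bit moves down */
        b[i] = (a[i] >> 1) | (carry << (SKETCH_DIGIT_BITS - 1u));
        carry = next;
    }
    return carry;
}
```

Doubling first and then halving, as the `n == 11` branch does, always returns the starting magnitude as long as the doubled value fits, which gives the tester an exact expected result to compare against.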
|
||||
|
36
timings.txt
@ -1,36 +0,0 @@
CLOCKS_PER_SEC == 1000
Adding 128-bit => 14534883/sec, 688 ticks
Adding 256-bit => 11037527/sec, 906 ticks
Adding 512-bit => 8650519/sec, 1156 ticks
Adding 1024-bit => 5871990/sec, 1703 ticks
Adding 2048-bit => 3575259/sec, 2797 ticks
Adding 4096-bit => 2018978/sec, 4953 ticks
Subtracting 128-bit => 11025358/sec, 907 ticks
Subtracting 256-bit => 9149130/sec, 1093 ticks
Subtracting 512-bit => 7440476/sec, 1344 ticks
Subtracting 1024-bit => 5078720/sec, 1969 ticks
Subtracting 2048-bit => 3168567/sec, 3156 ticks
Subtracting 4096-bit => 1833852/sec, 5453 ticks
Squaring 128-bit => 3205128/sec, 78 ticks
Squaring 256-bit => 1592356/sec, 157 ticks
Squaring 512-bit => 696378/sec, 359 ticks
Squaring 1024-bit => 266808/sec, 937 ticks
Squaring 2048-bit => 85999/sec, 2907 ticks
Squaring 4096-bit => 21949/sec, 11390 ticks
Multiplying 128-bit => 3205128/sec, 78 ticks
Multiplying 256-bit => 1592356/sec, 157 ticks
Multiplying 512-bit => 615763/sec, 406 ticks
Multiplying 1024-bit => 192752/sec, 1297 ticks
Multiplying 2048-bit => 53510/sec, 4672 ticks
Multiplying 4096-bit => 14801/sec, 16890 ticks
Exponentiating 513-bit => 531/sec, 47 ticks
Exponentiating 769-bit => 177/sec, 141 ticks
Exponentiating 1025-bit => 88/sec, 282 ticks
Exponentiating 2049-bit => 13/sec, 1890 ticks
Exponentiating 2561-bit => 6/sec, 3812 ticks
Exponentiating 3073-bit => 4/sec, 6031 ticks
Exponentiating 4097-bit => 1/sec, 12843 ticks
Inverting mod 128-bit => 19160/sec, 5219 ticks
Inverting mod 256-bit => 8290/sec, 12062 ticks
Inverting mod 512-bit => 3565/sec, 28047 ticks
Inverting mod 1024-bit => 1305/sec, 76594 ticks
36
timings2.txt
@ -1,36 +0,0 @@
CLOCKS_PER_SEC == 1000
Adding 128-bit => 15600624/sec, 641 ticks
Adding 256-bit => 12804097/sec, 781 ticks
Adding 512-bit => 10000000/sec, 1000 ticks
Adding 1024-bit => 7032348/sec, 1422 ticks
Adding 2048-bit => 4076640/sec, 2453 ticks
Adding 4096-bit => 2424242/sec, 4125 ticks
Subtracting 128-bit => 10845986/sec, 922 ticks
Subtracting 256-bit => 9416195/sec, 1062 ticks
Subtracting 512-bit => 7710100/sec, 1297 ticks
Subtracting 1024-bit => 5159958/sec, 1938 ticks
Subtracting 2048-bit => 3299241/sec, 3031 ticks
Subtracting 4096-bit => 1987676/sec, 5031 ticks
Squaring 128-bit => 3205128/sec, 78 ticks
Squaring 256-bit => 1592356/sec, 157 ticks
Squaring 512-bit => 696378/sec, 359 ticks
Squaring 1024-bit => 266524/sec, 938 ticks
Squaring 2048-bit => 86505/sec, 2890 ticks
Squaring 4096-bit => 22471/sec, 11125 ticks
Multiplying 128-bit => 3205128/sec, 78 ticks
Multiplying 256-bit => 1592356/sec, 157 ticks
Multiplying 512-bit => 615763/sec, 406 ticks
Multiplying 1024-bit => 190548/sec, 1312 ticks
Multiplying 2048-bit => 54418/sec, 4594 ticks
Multiplying 4096-bit => 14897/sec, 16781 ticks
Exponentiating 513-bit => 531/sec, 47 ticks
Exponentiating 769-bit => 177/sec, 141 ticks
Exponentiating 1025-bit => 84/sec, 297 ticks
Exponentiating 2049-bit => 13/sec, 1875 ticks
Exponentiating 2561-bit => 6/sec, 3766 ticks
Exponentiating 3073-bit => 4/sec, 6000 ticks
Exponentiating 4097-bit => 1/sec, 12750 ticks
Inverting mod 128-bit => 17301/sec, 578 ticks
Inverting mod 256-bit => 8103/sec, 1234 ticks
Inverting mod 512-bit => 3422/sec, 2922 ticks
Inverting mod 1024-bit => 1330/sec, 7516 ticks
@ -1,5 +0,0 @@
Exponentiating 513-bit => 531/sec, 94 ticks
Exponentiating 769-bit => 187/sec, 266 ticks
Exponentiating 1025-bit => 88/sec, 562 ticks
Exponentiating 2049-bit => 13/sec, 3719 ticks
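Note on reading the removed timing logs: the per-second figures follow from the integer expression used in the demo's `printf` calls, `(rr * CLOCKS_PER_SEC) / tt`, where `rr` is the iteration count and `tt` the elapsed clock ticks. The following is a minimal sketch of that arithmetic. The function name is hypothetical, and the iteration counts in the usage note are inferred from the logged values rather than stated in the logs.

```c
#include <assert.h>

/* Operations per second from an iteration count and elapsed ticks,
 * using the same truncating integer division as the demo's printf
 * calls. clocks_per_sec is a parameter so that values recorded with
 * CLOCKS_PER_SEC == 1000 can be re-derived on any platform. */
static unsigned long long
sketch_ops_per_sec(unsigned long long rr, unsigned long long tt,
                   unsigned long long clocks_per_sec)
{
    assert(tt > 0);               /* avoid dividing by zero ticks */
    return (rr * clocks_per_sec) / tt;
}
```

For example, 25 iterations in 47 ticks and 50 iterations in 94 ticks both give 531/sec at 1000 ticks per second, which matches the 513-bit exponentiation lines above. Because the division truncates and short runs span only a few dozen ticks, the low-order digits of these figures carry little information.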