This produces slightly better performance than the inline assembly,
and has the added benefit that it should be portable to other systems
that use gcc, not just x86-64.
Here are the results on my "AMD Athlon(tm) 7450 Dual-Core Processor"
with "gcc (Ubuntu 4.3.3-5ubuntu4) 4.3.3":

with portable 64H macros:
  camellia       : Schedule at 1659
  camellia [ 23] : Encrypt at 431, Decrypt at 434
  whirlpool      : Process at 55

with inline assembly (with "memory clobber" for correctness):
  camellia       : Schedule at 1380
  camellia [ 23] : Encrypt at 406, Decrypt at 403
  whirlpool      : Process at 50

with __builtin_bswap64:
  camellia       : Schedule at 1352
  camellia [ 23] : Encrypt at 396, Decrypt at 391
  whirlpool      : Process at 46
This had been causing Camellia (the only cipher that uses these
macros) to fail when compiled "out-of-the-box" with gcc version
"4.3.3-5ubuntu4". I think this is because the compiler had no idea
that any memory access was going on in these macros.
Adding "memory" as a clobber solves the problem, but is probably
overkill. I suspect that if we specify the constraint for y
differently, we could get rid of both "memory" and __volatile__, which
would allow the compiler to optimize much more.
Also, in gcc versions that support it, we should probably use the
bswap builtins instead.
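
As a rough sketch of the builtin-based approach (not the actual
LibTomCrypt macros), the helpers below load and store a 64-bit
big-endian value. All names starting with "sketch_" or "SKETCH_" are
hypothetical. The version test assumes that __builtin_bswap64 is
available from gcc 4.3 onwards and in clang; other compilers fall back
to plain shifts. Because the memory access is ordinary C (memcpy), the
compiler can see it, so no "memory" clobber or __volatile__ is needed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical feature test: assumes __builtin_bswap64 exists in
   gcc >= 4.3 and in clang (which reports itself as an older gcc). */
#if defined(__clang__) || \
    (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)))
#define SKETCH_HAVE_BSWAP64 1
#else
#define SKETCH_HAVE_BSWAP64 0
#endif

/* Returns 1 on a little-endian host. */
static int sketch_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Load a 64-bit big-endian value from y (hypothetical stand-in for a
   LOAD64H-style macro). */
static uint64_t sketch_load64h(const unsigned char *y)
{
#if SKETCH_HAVE_BSWAP64
    uint64_t x;
    memcpy(&x, y, sizeof x);  /* ordinary read, visible to the compiler */
    return sketch_little_endian() ? __builtin_bswap64(x) : x;
#else
    uint64_t x = 0;
    int i;
    for (i = 0; i < 8; i++) {
        x = (x << 8) | y[i];
    }
    return x;
#endif
}

/* Store x to y as a 64-bit big-endian value (hypothetical stand-in for
   a STORE64H-style macro). */
static void sketch_store64h(uint64_t x, unsigned char *y)
{
#if SKETCH_HAVE_BSWAP64
    if (sketch_little_endian()) {
        x = __builtin_bswap64(x);
    }
    memcpy(y, &x, sizeof x);
#else
    int i;
    for (i = 7; i >= 0; i--) {
        y[i] = (unsigned char)(x & 0xff);
        x >>= 8;
    }
#endif
}
```

The runtime endianness probe is a design choice for the sketch only; a
real implementation would more likely rely on the library's existing
endianness configuration macros.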
As near as I can tell, LibTomCrypt doesn't provide any way to tell
which cipher failed when it reports a cipher test failure. For
example, I was getting:

  Algorithm failed test vectors. (5)
  cipher_hash_test.c:14:cipher_descriptor[x].test()

But there's no way to tell what value x has, and even if there were,
it would take a bit of digging to determine which algorithm that
corresponds to. So I added a variant of the DO() macro, DOX(), that
takes an additional string argument, which is displayed on failure.
Now I get:

  Algorithm failed test vectors. (5) - camellia
  cipher_hash_test.c:14:cipher_descriptor[x].test()
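
A minimal sketch of how a DOX()-style variant might be built. The
names (sketch_format_failure, sketch_run_cmd, SKETCH_DO, SKETCH_DOX)
and the "Test failed." wording are hypothetical and are not the actual
LibTomCrypt test helpers. The real code reports the library's own
error string.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter: builds the failure report in `out`.
   `name` may be NULL, in which case the " - name" suffix is omitted,
   matching the plain DO()-style output. */
static int sketch_format_failure(char *out, size_t outlen, int res, int line,
                                 const char *file, const char *cmd,
                                 const char *name)
{
    return snprintf(out, outlen, "Test failed. (%d)%s%s\n%s:%d:%s\n",
                    res, name ? " - " : "", name ? name : "",
                    file, line, cmd);
}

/* Hypothetical runner: reports a nonzero result on stderr and returns
   the result unchanged. */
static int sketch_run_cmd(int res, int line, const char *file,
                          const char *cmd, const char *name)
{
    if (res != 0) {
        char msg[512];
        sketch_format_failure(msg, sizeof msg, res, line, file, cmd, name);
        fputs(msg, stderr);
    }
    return res;
}

/* DO-style macro: no extra context. */
#define SKETCH_DO(x)        sketch_run_cmd((x), __LINE__, __FILE__, #x, NULL)
/* DOX-style macro: the extra string is shown on failure. */
#define SKETCH_DOX(x, name) sketch_run_cmd((x), __LINE__, __FILE__, #x, (name))
```

A call site would then look something like
SKETCH_DOX(cipher_descriptor[x].test(), cipher_descriptor[x].name),
assuming the descriptor exposes the algorithm name as a string.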
"make clean" was deleting "doc/*.pdf", despite the fact that there
were two comments (one above and one below) stating that it did not.
Since doc/crypt.pdf is checked into git, running "make clean" made my
git state dirty, which seems undesirable.
As a compromise, "make clean" still deletes any other .pdf files in
doc (such as refman.pdf), but explicitly does not delete crypt.pdf.
This line:

  rm -f `find . -type f | grep "[.]lo" | xargs`

was deleting crypt.lof, which seemed undesirable: the unanchored
pattern "[.]lo" also matches ".lof". One solution would be to end the
grep expression with "$", but it seemed more straightforward just to
pass "-name" to "find", rather than piping through grep.
This seemed to be the only place in the code that was using this
particular transposition. And, indeed, when compiling with
"GMP_DESC", it looks like it is necessary to disable Diffie-Hellman.
(Otherwise, the test fails for me.)
addmod and submod are moved to the end of the math descriptor, so that
existing software can run against a new version of ltc without being
rebuilt.
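
To illustrate why appending helps, here is a small sketch with two
hypothetical descriptor layouts (these are not the real
ltc_math_descriptor, and the member signatures are made up). When new
members are added only at the end, every older member keeps its
offset, so code compiled against the old layout still finds the
members it knows about.

```c
#include <stddef.h>

/* Hypothetical function-pointer types for the sketch. */
typedef int (*sketch_binop)(void *a, void *b, void *c);
typedef int (*sketch_modop)(void *a, void *b, void *m, void *d);

/* Layout that existing software was compiled against. */
typedef struct {
    const char  *name;
    sketch_binop add;
    sketch_binop sub;
} sketch_desc_old;

/* Newer layout: the new members are appended at the end only. */
typedef struct {
    const char  *name;
    sketch_binop add;
    sketch_binop sub;
    sketch_modop addmod;
    sketch_modop submod;
} sketch_desc_new;

/* Returns 1 if every member of the old layout sits at the same offset
   in the new layout. */
static int sketch_layout_compatible(void)
{
    return offsetof(sketch_desc_old, name) == offsetof(sketch_desc_new, name)
        && offsetof(sketch_desc_old, add)  == offsetof(sketch_desc_new, add)
        && offsetof(sketch_desc_old, sub)  == offsetof(sketch_desc_new, sub)
        && sizeof(sketch_desc_new) >= sizeof(sketch_desc_old);
}
```

Had addmod and submod been inserted in the middle of the struct, the
offsets of the members after them would shift and this check would
fail.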