First of all, it had a failure in SEED:
LTC_KSEED failed for x=0, I got:
expected actual (ciphertext)
5e == 5e
ba == ba
c6 == c6
e0 == e0
05 != 00
4e != 00
16 != 00
68 != 00
19 == 19
af == af
f1 == f1
cc == cc
6d != 00
34 != 00
6c != 00
db != 00

Since SEED uses the 32H macros, this is analogous to the problem I saw
with the 64H macros in Camellia under gcc. I'm not sure why gcc only
had a problem with 64H and not 32H, but since this is an interaction
with the optimizer, it won't happen every time the macro is used. That
is why the store tests pass: the problems only show up once you get
into the complexity of a real cipher. It also makes sense that the
behavior varies from compiler to compiler.

Anyway, I added the ability to use __builtin_bswap32, in addition to
__builtin_bswap64, which I added in a previous commit. This solves the
problem for clang, although I had to add new logic to detect the bswap
builtins in clang, since clang detects them differently than gcc does
(see the comments in the code). The detection logic was complicated
enough, and applied to both the 32H and 64H macros, so I factored it
out into tomcrypt_cfg.h.
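
For readers who don't want to dig through the header, here is a rough
sketch of the idea, not the actual code from tomcrypt_cfg.h. It assumes
clang is probed with __has_builtin and gcc with a version check
(__builtin_bswap32 and __builtin_bswap64 exist since gcc 4.3). The names
EXAMPLE_HAVE_BSWAP_BUILTIN, example_load32h and example_store32h are
made up for illustration, and the byte-order test relies on the
__BYTE_ORDER__ macro that gcc and clang predefine.

```c
#include <stdint.h>
#include <string.h>

/* EXAMPLE_HAVE_BSWAP_BUILTIN is a hypothetical name, not the macro
 * actually defined in tomcrypt_cfg.h. */
#if defined(__clang__) && defined(__has_builtin)
  /* clang: ask about each builtin by name */
  #if __has_builtin(__builtin_bswap32) && __has_builtin(__builtin_bswap64)
    #define EXAMPLE_HAVE_BSWAP_BUILTIN 1
  #endif
#elif defined(__GNUC__)
  /* gcc: infer availability from the compiler version (4.3 or later) */
  #if (__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)
    #define EXAMPLE_HAVE_BSWAP_BUILTIN 1
  #endif
#endif

/* Use the builtin only on a known little-endian host; otherwise fall
 * back to portable byte-by-byte code. */
#if defined(EXAMPLE_HAVE_BSWAP_BUILTIN) && defined(__BYTE_ORDER__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
  #define EXAMPLE_USE_BSWAP 1
#endif

/* Load a 32-bit big-endian value from p (what a 32H load is meant to do). */
static uint32_t example_load32h(const unsigned char *p)
{
#ifdef EXAMPLE_USE_BSWAP
    uint32_t x;
    memcpy(&x, p, sizeof x);      /* avoids alignment and aliasing issues */
    return __builtin_bswap32(x);
#else
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
#endif
}

/* Store x to p as a 32-bit big-endian value. */
static void example_store32h(uint32_t x, unsigned char *p)
{
#ifdef EXAMPLE_USE_BSWAP
    x = __builtin_bswap32(x);
    memcpy(p, &x, sizeof x);
#else
    p[0] = (unsigned char)(x >> 24);
    p[1] = (unsigned char)(x >> 16);
    p[2] = (unsigned char)(x >> 8);
    p[3] = (unsigned char)x;
#endif
}
```

Both branches should produce the same bytes, so the first expected
ciphertext word above (5e ba c6 e0) loads as 0x5ebac6e0 either way. The
64H case would follow the same pattern with __builtin_bswap64.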