/* optimized version of Skipjack algorithm
*
* the appropriate g-function is inlined for each round
*
* the data movement is minimized by rotating the names of the
* variables w1..w4, not their contents (saves 3 moves per round)
*
* the loops are completely unrolled (needed so that the choice of g
* for each round is fixed at compile time)
*
* compiles to about 470 instructions on a Sparc (gcc -O),
* which is about 58 instructions per byte, 14 per round.
* gcc seems to leave in some unnecessary AND-with-0xFF operations,
* but only in the latter part of the functions. Perhaps it
* runs out of resources to properly optimize a long inlined function?
* In theory it should get about 11 instructions per round, not 14.
*/
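/*
* Illustrative sketch only, not part of the cipher: it shows the
* "rotate the names, not the contents" idea described above on a
* Rule-A-shaped round. Everything named sketch_* / SKETCH_* is
* hypothetical and exists only for this illustration. In particular
* sketch_g is a toy mixing function standing in for the keyed g; it is
* NOT Skipjack's g and uses no F-table or key.
*/
static unsigned short sketch_g(unsigned short w, unsigned k)
{
    /* toy stand-in: 16-bit rotate left by 3, then mix in the round number */
    unsigned r = (((unsigned)w << 3) | ((unsigned)w >> 13)) ^ (0x9E37u * (k + 1u));
    return (unsigned short)(r & 0xFFFFu);
}

/* Straightforward form: every round shifts the four words along. */
static void sketch_rounds_moving(unsigned short w[4], unsigned rounds)
{
    unsigned k;
    for (k = 1; k <= rounds; k++) {
        unsigned short g = sketch_g(w[0], k);
        unsigned short n1 = (unsigned short)((g ^ w[3] ^ k) & 0xFFFFu);
        w[3] = w[2];   /* these three assignments are the moves */
        w[2] = w[1];   /* that renaming avoids                  */
        w[1] = g;
        w[0] = n1;
    }
}

/* One round updating two variables in place; (a) is the word playing
* the role of "w1" this round and (d) the word playing "w4". */
#define SKETCH_RULE_A(a, d, k) \
    do { \
        (a) = sketch_g((a), (k)); \
        (d) = (unsigned short)(((d) ^ (a) ^ (k)) & 0xFFFFu); \
    } while (0)

/* Unrolled form: the contents stay put and the roles of the names
* rotate instead. After four rounds the names line up again. */
static void sketch_rounds_renamed4(unsigned short w[4])
{
    unsigned short w1 = w[0], w2 = w[1], w3 = w[2], w4 = w[3];
    SKETCH_RULE_A(w1, w4, 1u);   /* logical order is now (w4, w1, w2, w3) */
    SKETCH_RULE_A(w4, w3, 2u);   /* logical order is now (w3, w4, w1, w2) */
    SKETCH_RULE_A(w3, w2, 3u);   /* logical order is now (w2, w3, w4, w1) */
    SKETCH_RULE_A(w2, w1, 4u);   /* back to (w1, w2, w3, w4)              */
    w[0] = w1; w[1] = w2; w[2] = w3; w[3] = w4;
}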
/*
* Further optimized test implementation of SKIPJACK algorithm
* Mark Tillotson <markt@chaos.org.uk>, 25 June 98
* Optimizations suit RISC (lots of registers) machine best.
*
* based on unoptimized implementation of
* Panu Rissanen <bande@lut.fi> 960624
*
* Reference: SKIPJACK and KEA Algorithm Specifications,
* Version 2.0, 29 May 1998
*/