|
 |
|
SQL Server Tips by Burleson |
CRC32 Checksum Algorithm
The Cyclic Redundancy Checksum is another form of one-way function
used in communication protocols. The hash value produced is much
smaller than the algorithms described so far and in fact only uses
32 bits. It is used primarily in network protocols and for simple
computer files checksums. CRCs are used to detect errors in
transmission or communication. Areas where CRC32 checksums are used
in, includes Ethernet, Fiber Distributed Data Interface (FDDI),
PKZip and WinZip file checksums, PNG image file format and the
ZModem serial communications protocol. CRC calculations are
constructed in ways such that anticipated types of errors, e.g., due
to noise in transmission channels, are almost always detected. CRC's
cannot, however, be safely relied upon to verify data integrity i.e.
that no changes whatsoever have occurred, since through intentional
modification it is possible to cause changes that will not be
detected through the use of a CRC. The hash functions described
earlier should be used for this. Because MS CryptoAPI does not
provide an implementation for this algorithm, the algorithm is
implemented directly in the XP_CRPYPT code base.
The CRC32 hash value is calculated using the following pseudo code:
shiftregister = initial value
(commonly 0x0000... or 0xFFFF...)
while bits remain in string to calculate CRC for
if MSB of shiftregister is set:
shiftregister = (shiftregister leftshift 1) xor polynomial
("leftshift" assumes big-endian architecture)
else:
shiftregister = shiftregister leftshift 1
xor next bit from the string into LSB of shiftregister
output shiftregister
The implementation of CRC32 provided in XP_CRPYT uses a lookup table
of all 256 possible bytes, which allows for the code to operate on a
byte at a time. This provides an eight-fold increase in the speed of
the algorithm compared to operating on a bit at a time. To avoid
having to calculate the lookup table each time you want to calculate
a CRC, a separate program was developed to generate the lookup table
vales and the generated values are defined in a global array. This
means that the table is initialized only once when SQL Server
initially loads the XP DLL. The total data size used by the table is
1 Kilobyte. The CRC32 hash value returned by this function is
implemented to return the same values as used in Zip files. With the
use of the lookup table the code implementation is as follows:
DWORD dwCRC = 0xFFFFFFFF;
for (ULONG j=0; j<m_pParameterData[DATA_TO_HASH_INDEX].m_cbActualLen;
j++)
dwCRC = (dwCRC >> 8) ^ g_dwCRC32LookupTable[(dwCRC & 0xFF) ^
m_pParameterData[DATA_TO_HASH_INDEX].m_pData[j]];
dwCRC ^= 0xFFFFFFFF;
return (ParamSetOutput(HASH_INDEX+1, (BYTE*) &dwCRC, sizeof(dwCRC),
FALSE) == SUCCEED);
The above book excerpt is from:
Super SQL
Server Systems
Turbocharge Database Performance with C++ External Procedures
ISBN:
0-9761573-2-2
Joseph Gama, P. J. Naughter
http://www.rampant-books.com/book_2005_2_sql_server_external_procedures.htm |