8 trang 59 lượt tải

Performance analysis and truncation methodologies | Đại Học Nội Vụ Hà Nội

118

Abstract—With the emergence of pervasive 64 bit computing we observe that it is more cost effective to compute a SHA-512 than it is to compute a SHA-256 over a given size of data. Performance analysis and truncation methodologies | Đại Học Nội Vụ Hà Nội. Tài liệu được sưu tầm gồm 7 trang, giúp bạn tham khảo, ôn tập và đạt kết quả cao!

Môn: Trí tuệ nhân tạo(huha) 10 tài liệu

Trường: Đại Học Nội Vụ Hà Nội 1.3 K tài liệu

Tác giả:

Linh bÙI

2 tháng trước

Tải xuống Chia sẻ Báo cáo

Danh sách Quiz

lOMoARcPSD| 59062190

Intel Architecture Group,

Microprocessor and Chipset

Development

Intel Corporation, Israel

Development Center

Haifa, Israel

shay.gueron@intel.com

Abstract—With the emergence of pervasive 64 bit computing

we observe that it is more cost effective to compute a SHA-

512 than it is to compute a SHA-256 over a given size of data.

We propose a standard way to use SHA-512 and truncate its

output to 256 bits. For 64 bit architectures, this would yield

a more efficient 256 bit hashing algorithm, than the current

SHA-256. We call this method SHA-512/256. We also provide

method for reducing the size of the SHA-512 constants table

that an implementation will need to store.

Keywords-hash algorithms, SHA-512.

I. INTRODUCTION

Robust and fast security functionality is basic tenant

for secure computer transactions. Hashing algorithms

have long been the poor-man of the community, with their

security receiving less attention than standard encryption

algorithms and with little attention paid to their speed. The

attacks against SHA-1 reversed this situation and there are

many new proposals being evaluated in response to the

NIST SHA-3 competition. In the aftermath of the SHA-1

attacks the advice NIST produced was to move to SHA-

256 [1]. As a result, many standards and products have

started to move towards larger hash sizes, although this

may be a somewhat protracted process as the SHA-3

competition now adds additional dimensions to feature

selection and future supportability issues. This movement

does not come without its costs as SHA-256 is about 2.2

times slower than SHA-1.

The reason why SHA-512 is faster than SHA-256 on

64bit machines is that has 37.5% less rounds per byte (80

rounds operating on 128 byte blocks) compared to SHA-

256 (64 rounds operating on 64 byte blocks), where the

operations use 64 bit integer arithmetic. The adoption

across the breadth of our product range of 64 bit ALU’s

make it possible to achieve better security using SHA-512

in less time than it takes to compute a SHA-256 hash.

However storing a SHA-512 bit hash is expensive,

especially in a constrained hardware environment, such as

a state-of-the-art processor. SHA-384 does reduce this

storage requirement somewhat by truncating the final

result of a SHA-512 to 384 bits. But by truncating the

result of SHA512 operation to 256 bits it is possible to

balance the cost of providing the necessary additional

security/storage against the performance cost of

calculating the hash.

We believe that adding SHA-512/256 to the SHA

portfolio would provide implementers with performance/cost

characteristics hitherto unavailable to them.

II. PERFORMANCE OF SHA-512 AND SHA-256

The performance of SHA-256 and SHA-512 depends on

the length of the hashed message. Here we provide a

summary.

Generally, SHA-256 and SHA-512 can be viewed as a

single invocation of an _init() function (that initializes the

eight 64-bit variables h0, h1, h2, h3, h4, h5, h6, h7), followed

by a sequence of invocations of an _update() function, and

an invocation a _finalize() function.

The _finalize() function itself consists of one or two

invocations of _update(), depending on the message’s length.

In addition, there are some operations to create a formatted

“last block(s)” (also called “padding”).

The _update() functions for SHA-256 and SHA-512 are

different and, even more importantly, operate on different

block sizes: 64 bytes for SHA-256 and 128 bytes for

SHA512.

From a performance standpoint the contribution of the

_init() function and the last block padding are negligible.

Therefore, the performance of SHA-256 and SHA-512 can

be quite accurately approximated from the performance of

their respective _update() functions, and the number of

invocations.

The number invocations of the _update() function

depends on the message length as follows.

SHA-256:

Let M be a message of x bytes, where: x

= 64n + r, 0≤m<64.

If r ≤ 55, the number of calls to _update() is (n+1)

If r > 55, the number of calls to _update() is (n+2)

Denote n = floor (x/64), r = x mod 64, and the cost

(in CPU cycles) of one SHA-256 _update()

function by

UPDATE256. The number of cycles for computing the

SHA-512/256

Shay Gueron

Department of Mathematics

University of Haifa

Haifa, Israel

shay@math.haifa.ac.il

Simon Johnson

Intel Architecture Group,

Intel Corporation

USA

simon.p.johnson@intel.com

Jesse Walker

Security Research Lab, Intel Labs,

Intel Corporation

USA

jesse.walker@intel.com

2011 Eighth International Conference on Information Technology: New Generations

978-0-7695-4367-3/11 $26.00 ' 2011

IEEE

DOI 10.1109/ITNG.2011.69

lOMoARcPSD| 59062190

355

SHA-256 of M is approximated by

UPDATE256 (n + 1 + floor(r/55)). (1)

SHA-512:

Let M be a message of y bytes, where:

y = 128m + s, 0≤s<128.

If s ≤ 111, the number of calls to _update() is (m+1)

If s > 111, the number of calls to _update() is (m+2)

Denote m = floor (x/64), s = y mod 64, and the cost (in

CPU cycles) of one SHA-512 _update() function by

UPDATE512. The number of cycles for computing the

SHA-512 of M is approximated by

UPDATE512(m + 1 + floor(s/ 111)) (2)

As an example, Figure 1 shows the SHA-512 flow of such

a message, and it also illustrates why the overheads beyond

the _update() invocations are negligible with respect to

performance.

Input:

a pointer to the hash string (8 * 64bit long

words), a pointer to the message whose byte length

is a multiple of 128.

Output: The hash string holding the SHA-

512 digest

of the message.

Prototype:

void SHA-512_128byte_blocks(uint64_t hash[8],

uint8_t msg[256], int byte_length)

Flow:

SHA-512Init(hash)

last_block = zero_string

last_block[byte 0] = 0x80

last_block[qword 15] =

big_endian(byte_length*8) append(msg,

last_block) for i=0 to byte_length/128 SHA-

512Update(hash, msg) msg = msg+128 end for

Output: The hash now holds the digest of the

message

Figure 1. SHA-512 of a message whose length is a multiple of 128 bytes

(pseudo code)

For comparison purposes we show the performance, in

total cycles per block and in CPU cycles per byte, of the

_update() functions for both SHA-256 and SHA-512,

measured on the 2010 Intel® Core™ architecture, Xeon

X5670 processor. We also show the total number of CPU

cycles required for hashing a 1024 bytes message. The

reported measurements were carried out on an Intel Xeon

X5670 processor running at 2.67 GHz. The operating system

was Linux (OpenSuse 11.1 64 bits). To isolate the

performance of the functions that we measured, we disabled

Intel® Turbo Boost Technology, Intel® Hyper-Threading

Technology, and Enhanced Intel Speedstep® Technology. No

X server and no network daemon were running. The results

are shown in Tables 1 and 2.

As an example for optimized code (unrolled assembler),

we used OpenSSL version 1.0.0a [3]. For “compact” code,

we used C code we developed ourselves.

TABLE I. SHA-256 PERFORMANCE

Cycles

Total

Per Byte

SHA-256 Update

(Compact C code)

1,863 29.11

SHA-256 Update

(OpenSSL unrolled asm code)

1,166 18.22

SHA-256 of a 1024 bytes message

(Compact C code)

33,757 32.97

SHA-256 of a 1024 bytes message

(OpenSSL unrolled asm code)

19,769 19.30

TABLE II. SHA-512 PERFORMANCE

Cycles

Total

Per Byte

SHA-512 Update

(Compact C code)

2,473 19.32

SHA-512 Update

(OpenSSL unrolled asm code)

1,483 11.58

SHA-512 on 1024 bytes message

(Compact C code)

20,928

20.43

SHA-512 on 1024 bytes message

(OpenSSL unrolled asm code)

13,392 13.07

In both examples we see that unrolling the code to

carefully take advantage of the parallelism in the CPU micro-

architecture, results in a performance improvement of ~37%

(for SHA-256) and 37-40% (for SHA-512).

To illustrate the accuracy of the performance

approximations, apply Equation (2) to a 1024 byte message

(m=8, s=0), with UPDATE512 = 1483 (Table 2, unrolled

code). The approximated total cycles count for SHA-512 is

13,347 cycles, which indeed closely approximates the

actually measured 13,392 cycles. The small differences can

be attributed to overheads.

When comparing apples-to-apples implementations,

SHA-512 performs ~50% more efficiently SHA-256. Even

when comparing “best” against “worst” implementations, the

worst SHA-512 performance is within ~6% of best SHA-256

implementation.

III. THE SHA-512/256 TRUNCATION

In this section we will show how to truncate SHA-512 to

256 bits. The result of this process we refer to as

SHA512/256.

SHA-384 [2] already provides an existing example for

truncation of SHA-512 to a shorter digest size. The

computations of SHA-384 are exactly the same as SHA-512,

and in order to signify that a hash was performed by a

truncated form of SHA-512, the initial hash values are set to

different constants. In other words, the only difference

lOMoARcPSD| 59062190

between SHA-512 and SHA-384 is in the _init() function,

and of course the truncation itself.

We propose to apply the same technique to the truncation

of SHA-512 to a 256 bit digest.

In SHA-512 the init() function sets the initial state to the

first 64 bits of the fractional parts of the square roots of the

first 8 prime numbers. In SHA-384 these constants are

replaced with the fractional parts of the square roots of the

ninth through sixteenth prime numbers.

By analogy, we propose that the initialization constants

for SHA-512/256 would be the fractional parts of the

seventeenth through twenty-fourth prime numbers. These

values are shown in Table 3. When the hashing has been

completed, and a 512 bit result is obtained, the truncated

digest would be defined the lower 256 bits of that result.

TABLE III. SHA-512/256 PROPOSED INITIAL CONSTANTS

Prime

Number

Prime

Value

The first 64bits of the fractional

part of the square roots of the

primes

ae5f9156e7b6d99b

cf6c85d39d1a1e15

2f73477d6a4563ca

6d1826cafd82e1ed

8b43d4570a51b936

e360b596dc380c3f

1c456002ce13e9f8

6f19633143a0af0e

Figure 2 provides a test vector example.

The 256 bytes message (represented as a sequence

of bytes; byte 0 is first, byte 255 is last): 00

01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f

10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f

20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f

30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f

40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f

50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f

60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f

70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f

80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f

90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f

a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af

b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf

c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf

d0 d1 d2 d3

d4 d5 d6 d7 d8 d9 da db dc dd de df

e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef

f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe ff

The SHA-

512/256 hash value (before truncation):

h0 = 4ff7ecb3e7c23b55

h1 = 9974eba17a3d1a62

h2 = 0504f18be2e472ea

h3 = c5c5cbf75b3b7550

h4 = 27a8af7dc7dc9845

h5 = 8cfc76997dc50cfd

h6 = f4f500cc1830f561

h7 = bf2abd3732fdf66a

The SHA512/256 hash value (256 bits):

h0 = 4ff7ecb3e7c23b55

h1 = 9974eba17a3d1a62

h2 = 0504f18be2e472ea

h3 = c5c5cbf75b3b7550

For comparison, the SHA-

512 hash value of the

same message is:

h0 = 1e7b80bc8edc552c

h1 = 8feeb2780e111477

h2 = e5bc70465fac1a77

h3 = b29b35980c3f0ce4

h4 = a036a6c946203682

h5 = 4bd56801e62af7e9

h6 = feba5c22ed8a5af8

h7 = 77bf7de117dcac6d

(note: completely different values are

obtained)

Figure 2. A SHA-512/256 example.

IV. CALCULATING THE SHA-512 CONSTANTS

WITH A SMALLER LOOKUP TABLE

The downside of implementing SHA-512 is that it

requires a table of eighty 64-bit constants (a 640 bytes lookup

table). For comparison, SHA-256 requires only sixty four 32

bit constant (a 256 bytes lookup table).

In some implementations the cost of storing data for the

lookup table can be exceptionally high. In such cases, the

storage requirement of SHA-512, namely for 384 more bytes

than SHA-256, would be considered as a disadvantage.

For SHA-256, there is a way to reduce the storage

requirement by computing the sixty four constants which are

defined to be the first 32 bits of the fractional parts of the

cube roots of the first sixty four prime numbers. One way to

compute these cube roots is to use Newton-Raphson

iterations, which, on 64 bit architectures, quickly converge to

provide the first 32 bits of the result.

lOMoARcPSD| 59062190

357

Unfortunately, this is not the case for SHA-512, because

the SHA-512 constants are defined to be the first 64 bits of

the fractional part of the cube roots of the first eighty primes.

Performing simple numerical iterations on 64 bit architecture

does not give the required precision.

We propose the following method for obtaining the SHA-

512 constants. They can be approximated, using the Newton-

Raphson Algorithm, up to the last two bytes (in no more than

14 iterations of the algorithm). Therefore, it is sufficient to

store, for each constant, only the two bytes (precomputed)

difference between the result of the NewtonRaphson

iterations and the exact constant. To avoid storing the first

eighty primes, it is enough to store only the difference from

the previous prime number. For example, the difference from

the first prime is (2) is 0, the difference of the next prime (3)

from the previous one (2) is 1, the next difference is 5-3=2

and so on. Since up to the first eighty primes, the largest

difference is 24, it follows that all differences can be

represented by 4 bits, so that pairs of differences can be

stored in a single byte.

Altogether, this method uses only 2.5 bytes for each

constant (instead of 8 bytes if the constants are stored),

therefore, reducing the table size from 640 bytes to only 200

bytes. This method trades a reduced table size with the small

cost of additional code and computations. The

implementation and the associated constants are detailed in

Figure 3 below.

Input: Pointers to three arrays.

Array 1 holds the differences between the first

80 primes (two differences per byte).

Array 2 holds the difference between the result

lOMoARcPSD| 59062190

of the Newton-

Raphson iterations and the desired

constant.

Array 3 a place to store the computed 80 SHA-

512

constants.

Output: The SHA-512 constants

Prototype:

void calculate_SHA-512_constants (uint16_t

deltas[80], uint8_t offsets[40], uint64_t

K[80])

Constants:

offsets =

0x01, 0x22, 0x42, 0x42, 0x46, 0x26, 0x42,

0x46, 0x62, 0x64, 0x26, 0x46, 0x84, 0x24,

0x24, 0xe4, 0x62, 0xa2, 0x66, 0x46, 0x62,

0xa2, 0x42, 0xcc, 0x42, 0x46, 0x2a, 0x66,

0x62, 0x64, 0x2a, 0xe4, 0x24, 0xe6, 0xa2,

0x46, 0x86, 0x64, 0x68, 0x48

Deltas =

0xfe22, 0x05cd, 0xfb2f, 0xfbbc, 0xf538,

0xf019, 0xef9b, 0x0118, 0x0242, 0x0fbe,

0xf28c, 0xf4e2, 0x096f, 0xf6b1, 0xf235,

0x0694, 0x0ad2, 0x05e3, 0x15b5, 0x1c65,

0x0275, 0xe483, 0xfbd4, 0x13b5, 0xdfab,

0xf210, 0xe13f, 0x0ee4, 0x0fc2, 0xe725,

0x026f, 0xee70, 0xeffc, 0x0926, 0xeaed,

0xf3df, 0xe3de, 0xf2a8, 0xeee6, 0xf53b,

0x0364, 0xf001, 0x1791, 0xfe30, 0x1218,

0x2910, 0xe02a, 0xd1b8, 0x10c8, 0xeb53,

0xeb99, 0x08a8, 0x1a63, 0x0acb, 0xe373,

0xf8a3, 0xf2fc, 0xef60, 0x2b72, 0xf9ec,

0x1e28, 0xfde9, 0xf915, 0x132b, 0xe19c,

0x0207, 0xeb1e, 0x1178, 0xefba, 0x18a6,

0x0dae, 0x071b, 0xfd84, 0x2493, 0xfebc,

0x0d4c, 0x02b6, 0xfe2a, 0xfaec,

0x1817

Flow:

double p = 2 //the first

prime for i=0 to 79 if

(i%2 = 1)

//offset is in the second nibble

p = p + (offsets[i/2] & 0x0f) else

//offset in the first

nibble p = p +

(offsets[i/2] >> 4) end if

double n = p/3 for j=0 to 13

n = n - (n

-p)/3n

// correction to 64 bits accuracy

K[i] = fraction(n) * 2

+ deltas[i]

end for

Output:

The constants are now in the K array

Length

256

Length T, encoded as a

1024 bit (Big-Endian)

number.

Here, T=256.

(byte 0 is first, byte 127 is the

last)

000100000000000000000000

000000000000000000000000

0000000000000000

The initial constants IV

512/t

(t=256)

(= SHA-512 (T))

h0 = 2b2b0a74439fba29

h1 = b0395e75cf517538

h2 = 2d56e63211d68a9a

h3 = cd2e4f0e7f903a4b

h4 = 1fa53c41cf466fe4

h5 = 60119e4c4bc5e6c6

h6 = b895a38bba334ca3

h7 = 68b7beb95a22e694

The 256 bytes message (represented as a sequence

of bytes; byte 0 is first, byte 255 is last): 00

01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f

10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f

20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f

30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f

40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f

50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f

60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f

70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f

80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f

90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f

a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af

b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf

c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf

d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df

e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef

f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe ff

The SHA-512/256 hash value (before truncation):

h0 = 50a201a0449a0617

h1 = 43e22f5c48eff125

h2 = 3ef8380002ae5655

lOMoARcPSD| 59062190

359

Let t be an integer satisfying 0 < t < 512, t 256, t 384.

Encode l as a Big-Endian 1024 bit integer T.

For the truncation of SHA-512 to t bits, namely SHA512/t,

we define the initial state (initialization constants) as follows:

512

= SHA-512 (T) (3 )

In other words, IV

512/t

is the output of the SHA-512 compression function (C

512

hereafter) operating on the encoded number

As above, SHA-512/t would use the initialization constants as in Equation (3), with all the remaining computations

remaining the same as SHA-the standard 512. When the hashing has been completed, and a 512 bit result is obtained, the

truncated digest would be defined the lower t bits of that result.

The initialization constants for SHA512/t for t=256 are shown in Table 4, and a test vector example is provided in Figure

TABLE IV. SHA-512/T (T=256) INITIAL CONSTANTS

Figure 3. Computing the SHA-512 constants

V. SHA-512/T – TRUNCATION TO OTHER OUTPUT

LENGTHS

It is conceivable that supporting digests of many other lengths, less than 512, would be useful. A straightforward way to

standardize such truncations would be to use another distinct set of eight primes, as SHA-384 and the proposed SHA-512/256

do. Such an approach does not scale gracefully because the initialization constants are not naturally related to the desired digest

length. To avoid this situation we suggest another truncation method.

h3 = 2b8c57b60cce2f2e

h4 = 4e2280f2ad1c4d8a

lOMoARcPSD| 59062190

h5 = 688eb8b073694f88

h6 = ad4d5b4cbe93c8f4

h7 = 442d3a450787b415

The SHA512/256 hash value (256 bits):

h0 = 50a201a0449a0617

h1 = 43e22f5c48eff125

h2 = 3ef8380002ae5655

h3 = 2b8c57b60cce2f2e

SHA-512 of the same

message: h0 =

1e7b80bc8edc552c h1 =

8feeb2780e111477 h2 =

e5bc70465fac1a77 h3 =

b29b35980c3f0ce4 h4 =

a036a6c946203682 h5 =

4bd56801e62af7e9 h6 =

feba5c22ed8a5af8 h7 =

77bf7de117dcac6d

(different values are obtained)

Figure 4. A SHA-512/t for t=256; Example.

This construction is different the method used by NIST

for the SHA-384 truncation (and therefore different from the

truncation we proposed in Section 3). On the other hand, this

construction enjoys the following properties:

SHA-512/t (M) = SHA-

512 (T || M), where T is as

above, “||” denotes string concatenation, and M is

any bit string (message) whose length in bits does

not exceed 2

– 1024.

In particular, all of the values IV

512-t

are distinct if

512

is collision resistant.

If t and t are two distinct positive integers less than

512, then SHA-512/t (M) SHA-

512/t (M) for any

message M, since by SHA-

512’s collision resistant

they are unequal before truncation.

SHA-512/t is collision resistant, pre-

image resistant,

and second pre-image resistant if SHA-

512 also has

these properties.

VI. CONCLUSION

From our performance analysis, and experimentation on

64 bit Intel Architecture, we have shown that the cost of

implementing a SHA-512 algorithm delivers a 50%

performance improvement over similar implementations of

SHA-256. We also showed that the storage costs for

implementing SHA-512 can be reduced by adding a small

amount of one-off computation to compute the SHA-512

constants - which we believe will be useful for constrained

implementation environment.

In order for users to be able to distinguish between a

SHA-512 digest which has been truncated and a

SHA512/256 digest, we also offer new initialization

constants, analogous to those used in SHA-384. We also

follow the standard truncation method in the SHA standard

[2] which can be extended to truncations to other lengths by

choosing the next set of 8 primes.

When the NSA designed the SHA family of algorithms

their design rationale was never published (this is one of the

two major motivations for the SHA-3 competition; the other

one being Wang’s attack on SHA-1). To the best of our

knowledge, from observing the SHA standard, and in

particular, the method used for defining SHA-384, the actual

values of the initialization constants are immaterial. They

only need to be unique per hash function, and this is what we

used for our SHA-512/256 truncation.

So, in addition, we propose an alternative method to

define SHA-512/t, a truncation of SHA-512 to t bits long

digests (for any positive t not equal to 256 or 384). This

construction is very similar to the mechanism used in the

Skein proposal for SHA-3 [4].

In either case, given SHA-512’s performance on 64 bit

architectures, we believe that SHA-512/256 removes a

performance obstacle for the adoption of wider hash values,

and that a truncated version of SHA-512 to 256 bits is a

viable alternative to SHA-256, for 64 bit architectures.

The analysis offered in this paper can also impact the

NIST SHA-3 competition, where the performance

requirement from the candidates is to be faster than SHA256

on a Intel® Core™2 Duo processor. Being 64bit architecture,

it favors the performance of SHA-512/256 by a

significant margin, and if this truncation is standardized,

then it would be reasonable to compare SHA-3 candidates

performance to the performance of SHA-512/256.

REFERENCES

[1] NIST, “NIST Brief Comments on Recent Cryptanalytic

Attacks on Secure Hashing Functions and Continued Security

Provided by SHA-1”, 25th August 2004,

http://csrc.nist.gov/groups/ST/toolkit/documents/shs/hash_sta

ndards_comments.pdf

[2] Federal Information Processing Standards Publication 180-3,

“SECURE HASH STANDARD”, October 2008,

http://csrc.nist.gov/publications/fips/fips180-3/fips180-

3_final.pdf

[3] OpenSSL Source Code, http://www.openssl.org/source

[4] N. Ferguson et al, “The Skein Hash Function Family”,

Version 1.3, 1st October 2010,

http://www.schneier.com/skein1.3.pdf

Corresponding author: Shay Gueron

Bấm Tải xuống để xem toàn bộ.

Preview text:

lOMoAR cPSD| 59062190
2011 Eighth International Conference on Information Technology: New Generations SHA-512/256 Shay Gueron Simon Johnson Jesse Walker Department of Mathematics Intel Architecture Group,
Security Research Lab, Intel Labs, University of Haifa Intel Corporation Intel Corporation Haifa, Israel USA USA shay@math.haifa.ac.il simon.p.johnson@intel.com jesse.walker@intel.com
However storing a SHA-512 bit hash is expensive, Intel Architecture Group,
especially in a constrained hardware environment, such as Microprocessor and Chipset
a state-of-the-art processor. SHA-384 does reduce this Development
storage requirement somewhat by truncating the final Intel Corporation, Israel
result of a SHA-512 to 384 bits. But by truncating the Development Center
result of SHA512 operation to 256 bits it is possible to Haifa, Israel
balance the cost of providing the necessary additional shay.gueron@intel.com
security/storage against the performance cost of calculating the hash.
Abstract—With the emergence of pervasive 64 bit computing
We believe that adding SHA-512/256 to the SHA
we observe that it is more cost effective to compute a SHA-
portfolio would provide implementers with performance/cost
512 than it is to compute a SHA-256 over a given size of data.
characteristics hitherto unavailable to them.
We propose a standard way to use SHA-512 and truncate its
output to 256 bits. For 64 bit architectures, this would yield
II. PERFORMANCE OF SHA-512 AND SHA-256
a more efficient 256 bit hashing algorithm, than the current
The performance of SHA-256 and SHA-512 depends on
SHA-256. We call this method SHA-512/256. We also provide
the length of the hashed message. Here we provide a a summary.
method for reducing the size of the SHA-512 constants table
Generally, SHA-256 and SHA-512 can be viewed as a
that an implementation will need to store.
single invocation of an _init() function (that initializes the
eight 64-bit variables h0, h1, h2, h3, h4, h5, h6, h7), followed
Keywords-hash algorithms, SHA-512.
by a sequence of invocations of an _update() function, and
an invocation a _finalize() function. I. INTRODUCTION
The _finalize() function itself consists of one or two
Robust and fast security functionality is basic tenant
invocations of _update(), depending on the message’s length.
for secure computer transactions. Hashing algorithms
In addition, there are some operations to create a formatted
have long been the poor-man of the community, with their
“last block(s)” (also called “padding”).
security receiving less attention than standard encryption
The _update() functions for SHA-256 and SHA-512 are
algorithms and with little attention paid to their speed. The
different and, even more importantly, operate on different
attacks against SHA-1 reversed this situation and there are
block sizes: 64 bytes for SHA-256 and 128 bytes for
many new proposals being evaluated in response to the SHA512.
NIST SHA-3 competition. In the aftermath of the SHA-1
From a performance standpoint the contribution of the
attacks the advice NIST produced was to move to SHA-
_init() function and the last block padding are negligible.
256 [1]. As a result, many standards and products have
Therefore, the performance of SHA-256 and SHA-512 can
started to move towards larger hash sizes, although this
be quite accurately approximated from the performance of
may be a somewhat protracted process as the SHA-3
their respective _update() functions, and the number of
competition now adds additional dimensions to feature invocations.
selection and future supportability issues. This movement
The number invocations of the _update() function
does not come without its costs as SHA-256 is about 2.2
depends on the message length as follows. times slower than SHA-1.
The reason why SHA-512 is faster than SHA-256 on SHA-256:
64bit machines is that has 37.5% less rounds per byte (80
Let M be a message of x bytes, where: x
rounds operating on 128 byte blocks) compared to SHA- = 64n + r, 0≤m<64.
256 (64 rounds operating on 64 byte blocks), where the
If r ≤ 55, the number of calls to _update() is (n+1)
operations use 64 bit integer arithmetic. The adoption
If r > 55, the number of calls to _update() is (n+2)
across the breadth of our product range of 64 bit ALU’s
Denote n = floor (x/64), r = x mod 64, and the cost
make it possible to achieve better security using SHA-512
(in CPU cycles) of one SHA-256 _update()
in less time than it takes to compute a SHA-256 hash. function by
UPDATE256. The number of cycles for computing the
978-0-7695-4367-3/11 $26.00 ' 2011 IEEE DOI 10.1109/ITNG.2011.69 lOMoAR cPSD| 59062190
SHA-256 of M is approximated by
As an example for optimized code (unrolled assembler),
we used OpenSSL version 1.0.0a [3]. For “compact” code,
UPDATE256 (n + 1 + floor(r/55)). (1)
we used C code we developed ourselves. SHA-512:
Let M be a message of y bytes, where: TABLE I. SHA-256 PERFORMANCE y = 128m + s, 0≤s<128. Cycles
If s ≤ 111, the number of calls to _update() is (m+1) Total Per Byte
If s > 111, the number of calls to _update() is (m+2) SHA-256 Update (Compact C code) 1,863 29.11
Denote m = floor (x/64), s = y mod 64, and the cost (in SHA-256 Update (OpenSSL unrolled asm code) 1,166 18.22
CPU cycles) of one SHA-512 _update() function by
UPDATE512. The number of cycles for computing the
SHA-256 of a 1024 bytes message (Compact C code) 33,757 32.97
SHA-512 of M is approximated by
SHA-256 of a 1024 bytes message (OpenSSL unrolled asm code) 19,769 19.30
UPDATE512(m + 1 + floor(s/ 111)) (2)
As an example, Figure 1 shows the SHA-512 flow of such TABLE II. SHA-512 PERFORMANCE
a message, and it also illustrates why the overheads beyond Cycles
the _update() invocations are negligible with respect to Total Per Byte performance. SHA-512 Update (Compact C code) 2,473 19.32 SHA-512 Update
Input: a pointer to the hash string (8 * 64bit long
words), a pointer to the message whose byte length (OpenSSL unrolled asm code) 1,483 11.58 is a multiple of 128. SHA-512 on 1024 bytes message 20,928 20.43 (Compact C code)
Output: The hash string holding the SHA-512 digest SHA-512 on 1024 bytes message of the message. (OpenSSL unrolled asm code) 13,392 13.07 Prototype:
void SHA-512_128byte_blocks(uint64_t hash[8],
uint8_t msg[256], int byte_length)
In both examples we see that unrolling the code to Flow:
carefully take advantage of the parallelism in the CPU micro- SHA-512Init(hash) last_block = zero_string
architecture, results in a performance improvement of ~37% last_block[byte 0] = 0x80
(for SHA-256) and 37-40% (for SHA-512). last_block[qword 15] =
To illustrate the accuracy of the performance
big_endian(byte_length*8) append(msg,
last_block) for i=0 to byte_length/128 SHA-
approximations, apply Equation (2) to a 1024 byte message
512Update(hash, msg) msg = msg+128 end for
(m=8, s=0), with UPDATE512 = 1483 (Table 2, unrolled
code). The approximated total cycles count for SHA-512 is
Output: The hash now holds the digest of the
13,347 cycles, which indeed closely approximates the message
actually measured 13,392 cycles. The small differences can
Figure 1. SHA-512 of a message whose length is a multiple of 128 bytes be attributed to overheads. (pseudo code)
When comparing apples-to-apples implementations,
SHA-512 performs ~50% more efficiently SHA-256. Even
For comparison purposes we show the performance, in
when comparing “best” against “worst” implementations, the
total cycles per block and in CPU cycles per byte, of the
worst SHA-512 performance is within ~6% of best SHA-256
_update() functions for both SHA-256 and SHA-512, implementation.
measured on the 2010 Intel® Core™ architecture, Xeon
X5670 processor. We also show the total number of CPU
III. THE SHA-512/256 TRUNCATION
cycles required for hashing a 1024 bytes message. The
In this section we will show how to truncate SHA-512 to
reported measurements were carried out on an Intel Xeon
256 bits. The result of this process we refer to as
X5670 processor running at 2.67 GHz. The operating system SHA512/256.
was Linux (OpenSuse 11.1 64 bits). To isolate the
SHA-384 [2] already provides an existing example for
performance of the functions that we measured, we disabled
truncation of SHA-512 to a shorter digest size. The
Intel® Turbo Boost Technology, Intel® Hyper-Threading
computations of SHA-384 are exactly the same as SHA-512,
Technology, and Enhanced Intel Speedstep® Technology. No
and in order to signify that a hash was performed by a
X server and no network daemon were running. The results
truncated form of SHA-512, the initial hash values are set to are shown in Tables 1 and 2.
different constants. In other words, the only difference 355 lOMoAR cPSD| 59062190
between SHA-512 and SHA-384 is in the _init() function,
The 256 bytes message (represented as a sequence
and of course the truncation itself.
of bytes; byte 0 is first, byte 255 is last): 00
We propose to apply the same technique to the truncation
01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
of SHA-512 to a 256 bit digest.
20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f
In SHA-512 the init() function sets the initial state to the
30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f
first 64 bits of the fractional parts of the square roots of the
40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f
first 8 prime numbers. In SHA-384 these constants are
50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f
replaced with the fractional parts of the square roots of the
60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f
ninth through sixteenth prime numbers.
70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f
80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f
By analogy, we propose that the initialization constants
90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f
for SHA-512/256 would be the fractional parts of the
a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af
seventeenth through twenty-fourth prime numbers. These
b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf
values are shown in Table 3. When the hashing has been
c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf
d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df
completed, and a 512 bit result is obtained, the truncated
e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef
digest would be defined the lower 256 bits of that result.
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe ff TABLE III.
SHA-512/256 PROPOSED INITIAL CONSTANTS
The SHA-512/256 hash value (before truncation): h0 = 4ff7ecb3e7c23b55 Prime Prime
The first 64bits of the fractional h1 = 9974eba17a3d1a62 Number Value
part of the square roots of the primes h2 = 0504f18be2e472ea h3 = c5c5cbf75b3b7550 h0 17 59 ae5f9156e7b6d99b h4 = 27a8af7dc7dc9845 h1 18 61 cf6c85d39d1a1e15 h5 = 8cfc76997dc50cfd h2 19 67 2f73477d6a4563ca h6 = f4f500cc1830f561 h7 = bf2abd3732fdf66a h3 20 71 6d1826cafd82e1ed h4 21 73 8b43d4570a51b936
The SHA512/256 hash value (256 bits): h5 22 79 e360b596dc380c3f h0 = 4ff7ecb3e7c23b55 h6 23 83 1c456002ce13e9f8 h1 = 9974eba17a3d1a62 h2 = 0504f18be2e472ea h7 24 89 6f19633143a0af0e h3 = c5c5cbf75b3b7550
Figure 2 provides a test vector example.
For comparison, the SHA-512 hash value of the same message is: h0 = 1e7b80bc8edc552c h1 = 8feeb2780e111477 h2 = e5bc70465fac1a77 h3 = b29b35980c3f0ce4 h4 = a036a6c946203682 h5 = 4bd56801e62af7e9 h6 = feba5c22ed8a5af8 h7 = 77bf7de117dcac6d
(note: completely different values are obtained)
Figure 2. A SHA-512/256 example. IV.
CALCULATING THE SHA-512 CONSTANTS WITH A SMALLER LOOKUP TABLE
The downside of implementing SHA-512 is that it
requires a table of eighty 64-bit constants (a 640 bytes lookup
table). For comparison, SHA-256 requires only sixty four 32
bit constant (a 256 bytes lookup table).
In some implementations the cost of storing data for the
lookup table can be exceptionally high. In such cases, the
storage requirement of SHA-512, namely for 384 more bytes
than SHA-256, would be considered as a disadvantage.
For SHA-256, there is a way to reduce the storage
requirement by computing the sixty four constants which are
defined to be the first 32 bits of the fractional parts of the
cube roots of the first sixty four prime numbers. One way to
compute these cube roots is to use Newton-Raphson
iterations, which, on 64 bit architectures, quickly converge to
provide the first 32 bits of the result. lOMoAR cPSD| 59062190
Unfortunately, this is not the case for SHA-512, because
and so on. Since up to the first eighty primes, the largest
the SHA-512 constants are defined to be the first 64 bits of
difference is 24, it follows that all differences can be
the fractional part of the cube roots of the first eighty primes.
represented by 4 bits, so that pairs of differences can be
Performing simple numerical iterations on 64 bit architecture stored in a single byte.
does not give the required precision.
Altogether, this method uses only 2.5 bytes for each
We propose the following method for obtaining the SHA-
constant (instead of 8 bytes if the constants are stored),
512 constants. They can be approximated, using the Newton-
therefore, reducing the table size from 640 bytes to only 200
Raphson Algorithm, up to the last two bytes (in no more than
bytes. This method trades a reduced table size with the small
14 iterations of the algorithm). Therefore, it is sufficient to
cost of additional code and computations. The
store, for each constant, only the two bytes (precomputed)
implementation and the associated constants are detailed in
difference between the result of the NewtonRaphson Figure 3 below.
iterations and the exact constant. To avoid storing the first
eighty primes, it is enough to store only the difference from
Input: Pointers to three arrays.
the previous prime number. For example, the difference from
Array 1 holds the differences between the first
the first prime is (2) is 0, the difference of the next prime (3)
80 primes (two differences per byte).
from the previous one (2) is 1, the next difference is 5-3=2
Array 2 holds the difference between the result 357 lOMoAR cPSD| 59062190
of the Newton-Raphson iterations and the desired constant.
Array 3 a place to store the computed 80 SHA-512 constants. Output: The SHA-512 constants Prototype:
void calculate_SHA-512_constants (uint16_t
deltas[80], uint8_t offsets[40], uint64_t K[80]) Constants: offsets =
0x01, 0x22, 0x42, 0x42, 0x46, 0x26, 0x42,
0x46, 0x62, 0x64, 0x26, 0x46, 0x84, 0x24,
0x24, 0xe4, 0x62, 0xa2, 0x66, 0x46, 0x62,
0xa2, 0x42, 0xcc, 0x42, 0x46, 0x2a, 0x66,
0x62, 0x64, 0x2a, 0xe4, 0x24, 0xe6, 0xa2, 0x46, 0x86, 0x64, 0x68, 0x48 Deltas =
0xfe22, 0x05cd, 0xfb2f, 0xfbbc, 0xf538,
0xf019, 0xef9b, 0x0118, 0x0242, 0x0fbe,
0xf28c, 0xf4e2, 0x096f, 0xf6b1, 0xf235, 0x0694, 0x0ad2, 0x05e3, 0 L x en1g5tb h 5, 0x1c65, 256
0x0275, 0xe483, 0xfbd4, 0x13b5, 0xdfab, 000100000000000000000000
0xf210, 0xe13f, 0x0ee4, 0x0fc2, 0xe725, 000000000000000000000000
0x026f, 0xee70, 0xeffc, 0x0926, 0xeaed, 000000000000000000000000
0xf3df, 0xe3de, 0xf2a8, 0xeee6, 0xf53b,
Length T, encoded as a 000000000000000000000000
0x0364, 0xf001, 0x1791, 0xfe30, 0x1218, 1024 bit (Big-Endian)
0x2910, 0xe02a, 0xd1b8, 0x10c8, 0xeb53, 000000000000000000000000 number.
0xeb99, 0x08a8, 0x1a63, 0x0acb, 0xe373, Here, T=256. 000000000000000000000000
0xf8a3, 0xf2fc, 0xef60, 0x2b72, 0xf9ec,
(byte 0 is first, byte 127 is the 000000000000000000000000
0x1e28, 0xfde9, 0xf915, 0x132b, 0xe19c, last) 000000000000000000000000
0x0207, 0xeb1e, 0x1178, 0xefba, 0x18a6, 000000000000000000000000
0x0dae, 0x071b, 0xfd84, 0x2493, 0xfebc, 000000000000000000000000
0x0d4c, 0x02b6, 0xfe2a, 0xfaec, 0x1817 0000000000000000 h0 = 2b2b0a74439fba29 Flow: h1 = b0395e75cf517538 double p = 2 //the firs T t he
initial constants IV512/t h2 = 2d56e63211d68a9a prime for i=0 to 79 if (t=256) h3 = cd2e4f0e7f903a4b (i%2 = 1) //offset is in t(h = e S s H e A c
-5o1n2d ( Tn))i bble h 4 = 1fa53c41cf466fe4 p = p + (offsets[i/2] & 0 x0f) else h5 = 60119e4c4bc5e6c6 //offset in the first h6 = b895a38bba334ca3
nibble p = p + The 256 bytes messag h e 7 ( = r e 6 p 8 r b e 7 s b e e n b t 9 e 5 d a 2 a 2 s e 6 a 9 4 s e quence (offsets[i/2] >> 4) o f e b n y d t e i s f ; b y t
e 0 is first, byte 255 is last): 00 double n = p/3 for j= 0 0 1 t 0 o 2 1 0 3 3
04 05 06 07 08 09 0a 0b 0c 0d 0e 0f n = n - (n3-p)/3n2 10
11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f // correction to 64 b 2 i 0 t s 2 1 a c 2 c 2 u r 2 a 3 c y 2 4 2 5
26 27 28 29 2a 2b 2c 2d 2e 2f K[i] = fraction(n) * 264 + 3 0 d e 3 l 1 t a 3 s 2 [ i 3 ] 3
34 35 36 37 38 39 3a 3b 3c 3d 3e 3f end for
40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f Output:
50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f
60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f
The constants are now in the K array
70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f
80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f
90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f
a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af
b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf
c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf
d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df
e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe ff
The SHA-512/256 hash value (before truncation): h0 = 50a201a0449a0617 h1 = 43e22f5c48eff125 h2 = 3ef8380002ae5655 lOMoAR cPSD| 59062190
Let t be an integer satisfying 0 < t < 512, t 256, t 384. h3 = 2b8c57b60cce2f2e
Encode l as a Big-Endian 1024 bit integer T. h4 = 4e2280f2ad1c4d8a
For the truncation of SHA-512 to t bits, namely SHA512/t,
we define the initial state (initialization constants) as follows: IV512 /t = SHA-512 (T) (3 )
In other words, IV512/t is the output of the SHA-512 compression function (C512 hereafter) operating on the encoded number T.
As above, SHA-512/t would use the initialization constants as in Equation (3), with all the remaining computations
remaining the same as SHA-the standard 512. When the hashing has been completed, and a 512 bit result is obtained, the
truncated digest would be defined the lower t bits of that result.
The initialization constants for SHA512/t for t=256 are shown in Table 4, and a test vector example is provided in Figure 4. TABLE IV.
SHA-512/T (T=256) INITIAL CONSTANTS
Figure 3. Computing the SHA-512 constants
V. SHA-512/T – TRUNCATION TO OTHER OUTPUT LENGTHS
It is conceivable that supporting digests of many other lengths, less than 512, would be useful. A straightforward way to
standardize such truncations would be to use another distinct set of eight primes, as SHA-384 and the proposed SHA-512/256
do. Such an approach does not scale gracefully because the initialization constants are not naturally related to the desired digest
length. To avoid this situation we suggest another truncation method. 359 lOMoAR cPSD| 59062190 h5 = 688eb8b073694f88
When the NSA designed the SHA family of algorithms h6 = ad4d5b4cbe93c8f4
their design rationale was never published (this is one of the h7 = 442d3a450787b415
two major motivations for the SHA-3 competition; the other
one being Wang’s attack on SHA-1). To the best of our
The SHA512/256 hash value (256 bits): h0 = 50a201a0449a0617
knowledge, from observing the SHA standard, and in h1 = 43e22f5c48eff125
particular, the method used for defining SHA-384, the actual h2 = 3ef8380002ae5655
values of the initialization constants are immaterial. They h3 = 2b8c57b60cce2f2e
only need to be unique per hash function, and this is what we
used for our SHA-512/256 truncation. SHA-512 of the same message: h0 =
So, in addition, we propose an alternative method to 1e7b80bc8edc552c h1 =
define SHA-512/t, a truncation of SHA-512 to t bits long 8feeb2780e111477 h2 =
digests (for any positive t not equal to 256 or 384). This e5bc70465fac1a77 h3 = b29b35980c3f0ce4 h4 =
construction is very similar to the mechanism used in the a036a6c946203682 h5 = Skein proposal for SHA-3 [4]. 4bd56801e62af7e9 h6 =
In either case, given SHA-512’s performance on 64 bit feba5c22ed8a5af8 h7 =
architectures, we believe that SHA-512/256 removes a 77bf7de117dcac6d
(different values are obtained)
performance obstacle for the adoption of wider hash values,
Figure 4. A SHA-512/t for t=256; Example.
and that a truncated version of SHA-512 to 256 bits is a
viable alternative to SHA-256, for 64 bit architectures.
This construction is different the method used by NIST
The analysis offered in this paper can also impact the
for the SHA-384 truncation (and therefore different from the
NIST SHA-3 competition, where the performance
truncation we proposed in Section 3). On the other hand, this
requirement from the candidates is to be faster than SHA256
construction enjoys the following properties:
on a Intel® Core™2 Duo processor. Being 64bit architecture,
it favors the performance of SHA-512/256 by a
SHA-512/t (M) = SHA-512 (T | M), where T is as
significant margin, and if this truncation is standardized,
above, “| ” denotes string concatenation, and M is
then it would be reasonable to compare SHA-3 candidates
performance to the performance of SHA-512/256.
any bit string (message) whose length in bits does not exceed 264 – 1024.
In particular, all of the values IV512-t are distinct if REFERENCES C512 is collision resistant.
[1] NIST, “NIST Brief Comments on Recent Cryptanalytic
If t and t are two distinct positive integers less than
Attacks on Secure Hashing Functions and Continued Security
512, then SHA-512/t (M) SHA-512/t (M) for any Provided by SHA-1”, 25th August 2004,
message M, since by SHA-512’s collision resistant
http://csrc.nist.gov/groups/ST/toolkit/documents/shs/hash_sta
they are unequal before truncation. ndards_comments.pdf
SHA-512/t is collision resistant, pre-image resistant,
[2] Federal Information Processing Standards Publication 180-3,
“SECURE HASH STANDARD”, October 2008,
and second pre-image resistant if SHA-512 also has
http://csrc.nist.gov/publications/fips/fips180-3/fips180- these properties. 3_final.pdf VI. CONCLUSION
[3] OpenSSL Source Code, http://www.openssl.org/source
From our performance analysis, and experimentation on
[4] N. Ferguson et al, “The Skein Hash Function Family”,
64 bit Intel Architecture, we have shown that the cost of Version 1.3, 1st October 2010,
implementing a SHA-512 algorithm delivers a 50%
http://www.schneier.com/skein1.3.pdf
performance improvement over similar implementations of
SHA-256. We also showed that the storage costs for
implementing SHA-512 can be reduced by adding a small
amount of one-off computation to compute the SHA-512
Corresponding author: Shay Gueron
constants - which we believe will be useful for constrained implementation environment.
In order for users to be able to distinguish between a
SHA-512 digest which has been truncated and a
SHA512/256 digest, we also offer new initialization
constants, analogous to those used in SHA-384. We also
follow the standard truncation method in the SHA standard
[2] which can be extended to truncations to other lengths by
choosing the next set of 8 primes.

Performance analysis and truncation methodologies | Đại Học Nội Vụ Hà Nội

Tài liệu liên quan:

Based Decomposition and Classification | Đại Học Nội Vụ Hà Nội

Chapter 1: Artificial Intelligence | Đại Học Nội Vụ Hà Nội

Chapter 2: Intelligent Agents | Đại Học Nội Vụ Hà Nội

Chapter 4a: Informed search algorithms | Đại Học Nội Vụ Hà Nội

Chapter 4b: Local search algorithms | Đại Học Nội Vụ Hà Nội