Thursday, 15 April 2010

floating point - Is it possible to omit rounding of intermediate results during arithmetic operations on multiple FP operands?




Is it possible to perform an arithmetic operation on multiple floating-point operands without rounding the intermediate results, rounding only the final result, and are there architectures that do this? As far as I've seen, after two floating-point operands are used in an addition/subtraction, the result gets rounded before being used as an operand in the next operation; at least, that is all I have seen.

Edit:

Below are some examples in single-precision format to elucidate the concept. The three least significant bits of the intermediate 27-bit mantissas taking part in the arithmetic operation are the guard, round, and sticky bits. As the examples show, with the intermediate mantissa bit structure used in an IEEE 754-compliant FP system, avoiding rounding of intermediate values is possible, and when it is done it can give a more accurate result:

1. Example 1:

a-b = 101001000010001110110100 1 0 1×2^exp

c = 100011001011001010010100×2^(exp-4) --> c = 000010001100101100101001 0 1 0×2^exp

1.1. If a-b-c is calculated after a-b is rounded:

rounded a-b = 101001000010001110110101×2^exp

a-b-c = 100110110101100010001011 1 1 0×2^exp

rounded a-b-c=100110110101100010001100×2^exp

1.2. If a-b-c is calculated without a-b being rounded:

a-b-c=100110110101100010001011 0 1 1×2^exp

rounded a-b-c = 100110110101100010001011×2^exp
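The steps of example 1 can be replayed in a few lines of Python. This is only a sketch under stated assumptions: the helper names `align` and `round_nearest_even` are made up for illustration, the 27-bit values are held in plain integers (24 mantissa bits followed by guard, round, and sticky), the rounding mode is assumed to be round-to-nearest, ties-to-even, and a carry out of the 24-bit mantissa is not handled.

```python
def align(mant24: int, shift: int) -> int:
    """Shift a 24-bit mantissa right by `shift` places and return a 27-bit
    value: 24 mantissa bits followed by guard, round and sticky bits."""
    wide = mant24 << 3                      # append G, R, S (all zero so far)
    kept = wide >> shift
    lost = wide & ((1 << shift) - 1)        # bits that fell off the right end
    return kept | (1 if lost else 0)        # fold the lost bits into sticky


def round_nearest_even(v27: int) -> int:
    """Round a 27-bit value to 24 bits, ties to even (hypothetical helper;
    a carry out of the 24-bit mantissa is not handled in this sketch)."""
    mant, grs = v27 >> 3, v27 & 0b111
    if grs > 0b100 or (grs == 0b100 and mant & 1):
        mant += 1
    return mant


# Values copied from example 1.
a_minus_b = int("101001000010001110110100" + "101", 2)
c = align(int("100011001011001010010100", 2), 4)

# Round a-b first, then subtract c and round again (two roundings).
twice = round_nearest_even((round_nearest_even(a_minus_b) << 3) - c)
# Keep all 27 bits of a-b, subtract c, round only at the end (one rounding).
once = round_nearest_even(a_minus_b - c)

print(format(twice, "024b"))  # 100110110101100010001100
print(format(once, "024b"))   # 100110110101100010001011
```

The two printed mantissas differ in the last place, which is the effect the example is meant to show.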

2. Example 2:

a-b = 100001001100101011100000 1 0 1×2^exp

c = 101001011010001110001000×2^(exp-5) --> c = 000001010010110100011100 0 1 0×2^exp

2.1. If a-b-c is calculated after a-b is rounded:

rounded a-b = 100001001100101011100001×2^exp

a-b-c = 011111111001110111000100 1 1 0×2^exp

a-b-c shifted 1 bit left = 111111110011101110001001 1 0 0×2^(exp-1)

rounded a-b-c = 111111110011101110001010×2^(exp-1)

2.2. If a-b-c is calculated without a-b being rounded:

a-b-c=011111111001110111000100 0 1 1×2^exp

a-b-c shifted 1 bit left = 111111110011101110001000 1 1 0×2^(exp-1)

rounded a-b-c = 111111110011101110001001×2^(exp-1)

3. Example 3:

a-b = 10000001101000110101001 1 1 1×2^exp

c = 100010100101011010010000×2^(exp-6) --> c = 000000100010100101011010 0 1 0×2^exp

3.1. If a-b-c is calculated after a-b is rounded:

rounded a-b = 10000001101000110101010×2^exp

a-b-c = 01111101010100001001111 1 1 0×2^exp

a-b-c shifted 1 bit left = 11111010101000010011111 1 0 0×2^(exp-1)

rounded a-b-c=11111010101000010100000×2^(exp-1)

3.2. If a-b-c is calculated without a-b being rounded:

a-b-c = 01111101010100001001111 1 0 1×2^exp

a-b-c shifted 1 bit left = 11111010101000010011111 0 1 0×2^(exp-1)

rounded a-b-c=11111010101000010011111×2^(exp-1)

4. Example 4:

a-b = 101100101000111000110110 0 1 1×2^exp

c = 100110010110011101100000×2^(exp-7) --> c = 000000010011001011001110 1 1 0×2^exp

4.1. If a-b-c is calculated after a-b is rounded:

rounded a-b = 101100101000111000110110×2^exp

a-b-c = 101100010101101101100111 0 1 0×2^exp

rounded a-b-c=101100010101101101100111×2^exp

4.2. If a-b-c is calculated without a-b being rounded:

a-b-c=101100010101101101100111 1 0 1×2^exp

rounded a-b-c = 101100010101101101101000×2^exp

5. Example 5:

a-b = 100000111011001111001010 0 1 1×2^exp

c = 110001011010010110010110×2^(exp-3) --> c = 000110001011010010110010 1 1 0×2^exp

5.1. If a-b-c is calculated after a-b is rounded:

rounded a-b = 100000111011001111001010×2^exp

a-b-c = 011010101111111100010111 0 1 0×2^exp

a-b-c shifted 1 bit left = 110101011111111000101110 1 0 0×2^(exp-1)

rounded a-b-c=110101011111111000101110×2^(exp-1)

5.2. If a-b-c is calculated without a-b being rounded:

a-b-c = 011010101111111100010111 1 0 1×2^exp

a-b-c shifted 1 bit left = 110101011111111000101111 0 1 0×2^(exp-1)

rounded a-b-c=110101011111111000101111×2^(exp-1)

6. Example 6:

a-b = 100000000011000111001010 0 0 1×2^exp

c = 110010001100110111000000×2^(exp-8) --> c = 000000001100100011001101 1 1 0×2^exp

6.1. If a-b-c is calculated after a-b is rounded:

rounded a-b = 100000000011000111001010×2^exp

a-b-c = 011111110110100011111100 0 1 0×2^exp

a-b-c shifted 1 bit left = 111111101101000111111000 1 0 0×2^(exp-1)

rounded a-b-c=111111101101000111111000×2^(exp-1)

6.2. If a-b-c is calculated without a-b being rounded:

a-b-c = 011111110110100011111100 0 1 1×2^exp

a-b-c shifted 1 bit left = 111111101101000111111000 1 1 0×2^(exp-1)

rounded a-b-c=111111101101000111111001×2^(exp-1)
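Example 6 adds a normalization step: the subtraction cancels the leading bit, so the result is shifted left by one before rounding. The sketch below replays it, again under assumptions: `normalize` and `round_nearest_even` are hypothetical helper names, values are 27-bit integers (24 mantissa bits plus guard, round, sticky), and rounding is assumed to be ties-to-even.

```python
def round_nearest_even(v27: int) -> int:
    """Round a 27-bit value (mantissa + G, R, S) to 24 bits, ties to even."""
    mant, grs = v27 >> 3, v27 & 0b111
    if grs > 0b100 or (grs == 0b100 and mant & 1):
        mant += 1
    return mant


def normalize(v27: int):
    """Shift left until the leading mantissa bit (bit 26) is set.
    Returns the shifted value and the exponent adjustment."""
    exp_adjust = 0
    while v27 and not (v27 >> 26) & 1:
        v27 <<= 1
        exp_adjust -= 1
    return v27, exp_adjust


# Values copied from example 6; c is already aligned to exp.
a_minus_b = int("100000000011000111001010" + "001", 2)
c_aligned = int("000000001100100011001101" + "110", 2)

# 6.1: a-b is rounded to 24 bits before c is subtracted.
diff, adj_twice = normalize((round_nearest_even(a_minus_b) << 3) - c_aligned)
twice = round_nearest_even(diff)

# 6.2: the full 27-bit a-b is used and only the final result is rounded.
diff, adj_once = normalize(a_minus_b - c_aligned)
once = round_nearest_even(diff)

print(format(twice, "024b"), adj_twice)  # 111111101101000111111000 -1
print(format(once, "024b"), adj_once)    # 111111101101000111111001 -1
```

In the twice-rounded path the final rounding lands exactly on a tie (guard, round, sticky = 1 0 0), because the earlier rounding discarded the bit that would have broken it.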

According to the paper referenced in the question, it is possible to calculate the dot product of a pair of length-N vectors with a single rounding operation at the end, getting the closest representable result to the dot product.

In practice, current computers do round intermediate results, which results in an answer that is not always the closest representable one. At best, they do the rounding to an extended format, which tends to reduce, but not eliminate, the intermediate-result rounding error.
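The effect is easy to observe in software. The sketch below, which assumes IEEE 754 double precision (what Python's `float` uses on common platforms), adds three values one at a time, rounding after every addition, and compares that with `math.fsum`, which returns the correctly rounded sum of its inputs. The input values are chosen only for illustration.

```python
import math

values = [1e16, 1.0, -1e16]

# Add one operand at a time; every intermediate sum is rounded to a double.
# (An explicit loop is used because the built-in sum() may apply compensated
# summation to floats in newer Python versions.)
acc = 0.0
for v in values:
    acc += v

# math.fsum tracks the exact sum internally and rounds once at the end.
exact_then_rounded = math.fsum(values)

print(acc)                 # 0.0
print(exact_then_rounded)  # 1.0
```

Here 1e16 + 1.0 is not representable and rounds back to 1e16, so the 1.0 is lost before the final subtraction.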

With fused multiply-add, the rounding is done N times, once for each multiply-add. Without fused multiply-add, it is done twice for each pair: once after the multiply and once again after the add.
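As a software illustration of the difference (not the hardware architecture from the paper), the sketch below computes a dot product two ways: with a rounding after every multiply and every add, and with exact rational arithmetic followed by a single rounding. The function names and the input vectors are assumptions chosen for the demonstration, and IEEE 754 double precision is assumed.

```python
from fractions import Fraction


def dot_rounded_each_step(xs, ys):
    """Ordinary float dot product: each product is rounded, then each sum."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = acc + x * y
    return acc


def dot_rounded_once(xs, ys):
    """Exact dot product using rationals, rounded to a double only once."""
    exact = sum(Fraction(x) * Fraction(y) for x, y in zip(xs, ys))
    return float(exact)


# (2^27 + 1)^2 and (2^27 + 1)(2^27 - 1) both need more than 53 bits,
# so each product is rounded in the step-by-step version.
xs = [float(2**27 + 1), float(2**27 + 1)]
ys = [float(2**27 + 1), -float(2**27 - 1)]

print(dot_rounded_each_step(xs, ys))  # 268435456.0
print(dot_rounded_once(xs, ys))       # 268435458.0
```

The exact result is 2^28 + 2; the step-by-step version loses the low-order bits of both products before they can cancel.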

The BibTeX info for the paper is:

@inproceedings{yao:correctly,
  author = "Tao Yao and Deyuan Gao and Xiaoya Fan and Jari Nurmi",
  title = "Correctly Rounded Architectures for Floating-Point Multi-Operand Addition and Dot-Product Computation",
  booktitle = "ASAP'13",
  pages = {346-355},
  year = {2013},
}

floating-point computer-science computer-architecture floating-point-precision
