Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
512 views
in Technique[技术] by (71.8m points)

c - How many 64-bit multiplications are needed to calculate the low 128-bits of a 64-bit by 128-bit product?

Consider that you want to calculate the low 128-bits of the result of multiplying a 64-bit and 128-bit unsigned number, and that the largest multiplication you have available is the C-like 64-bit multiplication which takes two 64-bit unsigned inputs and returns the low 64-bits of the result.

How many multiplications are needed?

Certainly you can do it with eight: break all the inputs up into 32-bit chunks and use your 64-bit multiplication to do the 4 * 2 = 8 required full-width 32*32->64 multiplications, but can one do better?

Of course the algorithm should do only a "reasonable" number of additions or other basic arithmetic on top of the multiplications (I'm not interested in solutions that re-invent multiplication as an addition loop and hence claim "zero" multiplications).

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Four, but it starts to get a little tricky.

Let a and b be the numbers to be multiplied, with a0 and a1 being the low and high 32 bits of a, respectively, and b0, b1, b2, b3 being 32-bit groups of b, from low to high respectively.

The desired result is the remainder of (a0 + a1?232) ? (b0 + b1?232 + b2?264 + b3?296) modulo 2128.

We can rewrite that as (a0 + a1?232) ? (b0 + b1?232) + (a0 + a1?232) ? (b2?264 + b3?296) modulo 2128.

The remainder of the latter term modulo 2128 can be computed as a single 64-bit by 64-bit multiplication (whose result is implicitly multiplied by 264).

Then the former term can be computed with three multiplications using a carefully implemented Karatsuba step. The simple version would involve a 33-bit by 33-bit to 66-bit product which is not available, but there is a trickier version that avoids it:

z0 = a0 * b0
z2 = a1 * b1
z1 = abs(a0 - a1) * abs(b0 - b1) * sgn(a0 - a1) * sgn(b1 - b0) + z0 + z2

The last line contains only one multiplication; the other two pseudo-multiplications are just conditional negations. Absolute-difference and conditional-negate are annoying to implement in pure C, but it could be done.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...