Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
602 views
in Technique[技术] by (71.8m points)

performance - How slow (how many cycles) is calculating a square root?

How slow (how many cycles) is calculating a square root? This came up in a molecular dynamics course where efficiency is important and taking unnecessary square roots had a noticeable impact on the running time of the algorithms.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

From Agner Fog's Instruction Tables:

On Core2 65nm, FSQRT takes 9 to 69 cc's (with almost equal reciprocal throughput), depending on the value and precision bits. For comparison, FDIV takes 9 to 38 cc's (with almost equal reciprocal throughput), FMUL takes 5 (recipthroughput = 2) and FADD takes 3 (recipthroughput = 1). SSE performance is about equal, but looks faster because it can't do 80bit math. SSE has a super fast approximate reciprocal and approximate reciprocal sqrt though.

On Core2 45nm, division and square root got faster; FSQRT takes 6 to 20 cc's, FDIV takes 6 to 21 cc's, FADD and FMUL haven't changed. Once again SSE performance is about the same.

You can get the documents with this information from his website.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...