cuda - Calculating performance of CUFFT

Question

asked Oct 24, 2021 in Technique by 深蓝 (71.8m points)

I am running CUFFT on an N*N dataset split into chunks of N*N/p across p GPUs, and I have a question about calculating the performance. First, a bit about how I am doing it:

1. Send an N*N/p chunk to each GPU
2. Run a batched 1-D FFT over the rows on each of the p GPUs
3. Copy the N*N/p chunks back to the host and transpose the entire dataset
4. Repeat step 1
5. Repeat step 2
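
The steps above amount to the row-column decomposition of a 2-D FFT. Here is a minimal sketch in plain Python, assuming a naive O(n^2) DFT as a stand-in for the batched CUFFT call and list slices as stand-ins for the per-GPU chunks. The helper names (`dft`, `fft_rows_in_chunks`, `transpose`, `fft2_row_column`, `dft2_direct`) are hypothetical, and the final transpose is added here only so the result can be compared against a direct 2-D DFT.

```python
import cmath

def dft(row):
    """Naive 1-D DFT; a stand-in for one row of a batched FFT call."""
    n = len(row)
    return [sum(row[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def fft_rows_in_chunks(matrix, p):
    """Split the rows into p chunks (one per 'GPU') and transform every row."""
    n = len(matrix)
    rows_per_chunk = n // p  # assumption: p divides n evenly
    out = []
    for g in range(p):
        chunk = matrix[g * rows_per_chunk:(g + 1) * rows_per_chunk]  # "send chunk"
        out.extend(dft(row) for row in chunk)  # "batched FFT" and "copy back"
    return out

def transpose(matrix):
    """Host-side transpose of the whole dataset."""
    return [list(col) for col in zip(*matrix)]

def fft2_row_column(matrix, p):
    """Row FFTs, transpose, row FFTs again (which are the column FFTs)."""
    step = fft_rows_in_chunks(matrix, p)
    step = transpose(step)
    step = fft_rows_in_chunks(step, p)
    return transpose(step)  # restore the usual [row][column] layout

def dft2_direct(matrix):
    """2-D DFT straight from the definition, used only as a reference."""
    n = len(matrix)
    return [[sum(matrix[r][c] * cmath.exp(-2j * cmath.pi * (u * r + v * c) / n)
                 for r in range(n) for c in range(n))
             for v in range(n)] for u in range(n)]
```

For a small input such as a 4x4 matrix with p = 2, `fft2_row_column` and `dft2_direct` should agree to within floating-point rounding.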

Gflops = ( 1e-9 * 5 * N * N * log2(N*N) ) / execution time

and the execution time is calculated as:

execution time = Sum(memcpyHtoD + kernel + memcpyDtoH times for row and col FFT for each GPU)

Is this the correct way to evaluate CUFFT performance on multiple GPUs? Is there any other way I could represent the performance of FFT?

Thanks.

1 Answer

深蓝 · Answer 1 · 2021-10-23T18:42:26+0000

If you are doing a complex transform, the operation count is correct (it should be 2.5 N log2(N) for a real-valued transform), but the GFLOP formula is incorrect because of the execution time it divides by. In a parallel, multiprocessor operation, the usual calculation of throughput is

operation count / wall clock time
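
To illustrate why wall clock time is the denominator for concurrent work, here is a minimal sketch in Python. It assumes that sleeping threads can stand in for GPUs working in parallel; `fake_gpu_work` and `timed_parallel_run` are hypothetical names, and the durations are made up. In a real CUDA program the timer would need to stop only after every device has finished its work.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_gpu_work(seconds):
    """Hypothetical stand-in for one GPU's copy + FFT + copy sequence."""
    time.sleep(seconds)
    return seconds

def timed_parallel_run(durations):
    """Run all workers concurrently; return (wall_clock, sum_of_worker_times)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(durations)) as pool:
        worker_times = list(pool.map(fake_gpu_work, durations))
    wall = time.perf_counter() - start
    return wall, sum(worker_times)

# Made-up per-worker durations in seconds.
wall, summed = timed_parallel_run([0.05, 0.06, 0.05, 0.07])
print(f"wall clock: {wall:.3f} s, sum of worker times: {summed:.3f} s")
```

The wall clock figure tracks the slowest worker, while the summed figure describes a run in which the workers execute one after another.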

In your case, presuming the GPUs are operating in parallel, either measure the wall clock time (i.e., how long the whole operation took) and use that as the execution time, or use this:

execution time = max(memcpyHtoD + kernel + memcpyDtoH times for row and col FFT for each GPU)
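
As a rough illustration of how much the choice of denominator matters, the sketch below computes the GFLOP figure both ways. The per-GPU timings are invented for the example, and `fft2_flops` and `gflops` are hypothetical helper names; the operation counts are the nominal 5 M log2(M) and 2.5 M log2(M) figures discussed above, with M = N*N.

```python
import math

def fft2_flops(n, real_input=False):
    """Nominal flop count for an n x n transform with M = n*n points:
    5*M*log2(M) for complex input, 2.5*M*log2(M) for real-valued input."""
    m = n * n
    factor = 2.5 if real_input else 5.0
    return factor * m * math.log2(m)

def gflops(n, seconds, real_input=False):
    """Throughput in GFLOP/s for a given execution time in seconds."""
    return 1e-9 * fft2_flops(n, real_input) / seconds

# Invented per-GPU totals in seconds: HtoD + kernel + DtoH, row and column passes.
per_gpu_seconds = [0.020, 0.022, 0.021, 0.025]

serial_time = sum(per_gpu_seconds)    # the denominator used in the question
parallel_time = max(per_gpu_seconds)  # the slowest GPU bounds a parallel run

print(f"sum-based: {gflops(4096, serial_time):.1f} GFLOP/s")
print(f"max-based: {gflops(4096, parallel_time):.1f} GFLOP/s")
```

Note that these per-GPU totals leave out the host-side transpose, which a wall clock measurement of the whole operation would include.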

As it stands, your calculation represents the serial execution time. Allowing for the overheads of the multi-GPU scheme, I would expect the performance numbers you calculate to be lower than those for the equivalent transform done on a single GPU.
