Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Recent questions tagged cuda

0 votes
593 views
1 answer
    When is calling the cudaDeviceSynchronize function really needed? As far as I understand from the CUDA ... the program so much? See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
590 views
1 answer
    I am running CUFFT on chunks (N*N/p) divided in multiple GPUs, and I have a question regarding calculating ... the performance of FFT? Thanks. See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
420 views
1 answer
    I am trying to modify the imageDenoising class in the CUDA SDK. I need to repeat the filter many times in case ... lead to many synchronization problems See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
674 views
1 answer
    While using texture memory I have come across the following code: uint f = (blockIdx.x * blockDim.x) + ... f? This confuses me. Thank you. See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
509 views
1 answer
    Is it possible to access a hard disk/flash disk directly from the GPU (CUDA/OpenCL) and load/store content ... any suggestions about the design. See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
600 views
1 answer
    I have the following thrust::transform call. my_functor *f_1 = new my_functor(); thrust::transform(data.begin(), ... what are these other kernels? See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
835 views
1 answer
    I hope you can help me to figure out the correct compiler option required for the below card: > ./ ... quantitatively this speed up? Thanks See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
636 views
1 answer
    I need to compute the element-wise multiplication of two vectors (Hadamard product) of complex numbers with NVIDIA ... possible with CUBLAS). See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
577 views
1 answer
    I am currently using CUDA 7.5 under VS 2013. Today I needed to remove some of the elements from a ... , is there anything special? See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
500 views
1 answer
    I am writing code to compute the dot product of two vectors using the CUBLAS dot product routine, but it returns the ... copy from CPU to GPGPU? See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
511 views
1 answer
    In cuBLAS, cublasIsamin() gives the argmin for a single-precision array. Here's the full function declaration: ... to device memory instead. See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
811 views
1 answer
    I have a C project in CMake in which I have embedded a CUDA kernel module. I want to pass --ptxas-options=-v ... options=-v to my nvcc compiler? See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
755 views
1 answer
    I'm new to CUDA. I appreciate your help and hope you can help me. I need to store multiple elements of ... (dev_matrix); cudaFree(dev_vector); } See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
658 views
1 answer
    A follow-up question to: EarlyExit and DroppedThreads. According to the above links, the code below should deadlock. Please ... ) result= add[0]; } See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
655 views
1 answer
    As stated by other questions and according to the link, you can ... .html#group__CUDART__MEMORY_1g9bcf02b53644eee2bef9983d807084c7 See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
446 views
1 answer
    If I perform a float (single-precision) operation on a host and a device (GPU arch sm_13), will the values be different? See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
1.1k views
1 answer
    The problem is simple: I have two matrices, A and B, that are M by N, where M >> N. I want to first take ... ? What about lda, ldb, ldc? Thanks! See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
545 views
1 answer
    Has anyone started to work with the CUDA5 SDK? I have an old project that uses some cutil functions, but ... sdkCreateTimer Just that simple... See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
549 views
1 answer
    I've looked all around this site and others, and nothing has worked. I'm resorting to posting a question for my ... other things ... } Thanks! See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
635 views
1 answer
    I have a thrust device_vector. I want to cast it to a raw pointer so that I can pass it to a kernel. How can I ... kernel<<<bl,tpb>>>(pass raw) See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
1.3k views
1 answer
    I have a problem when I try to compile simple code in which the C++ and CUDA code are compiled separately. ... Hat card: K20x Any idea? Thanks See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
456 views
1 answer
    I have b number of blocks and each block has t number of threads. I can use __syncthreads() to synchronize the ... blocks. How can I do this? See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
701 views
1 answer
    In CUDA we can use pinned memory to more efficiently copy the data from Host to GPU than the default ... is a better programming practice. See Question&Answers more detail:os...
asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
344 views
1 answer
    I was writing a simple memcpy kernel to measure the memory bandwidth of my GTX 760M and to compare it to ... independent of all other elements? See Question&Answers more detail:os...
asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
401 views
1 answer
    I am trying to run a matrix inversion from the device. This logic works fine if called from the host. ... memory access inside the kernel. See Question&Answers more detail:os...
asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
675 views
1 answer
    Consider the following code: __global__ void kernel(int *something) { extern __shared__ int shared_array[]; // Some ... cell in some thread? See Question&Answers more detail:os...
asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
730 views
1 answer
    I have a question about branch predication in GPUs. As far as I know, in GPUs, they do predication with branches. ... memory, and so on? Thanks See Question&Answers more detail:os...
asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)
0 votes
605 views
1 answer
    I have data stored as arrays of floats (single precision). I have one array for my real data, and one array ... it is stored in memory. Thanks! See Question&Answers more detail:os...
asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

...