python - Different behavior of tensors: debugging vs normal execution on gpu

asked Oct 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

So I have a program (that I didn't write) running on a GPU that is being parallelized with CUDA v10.2. Now there's the following block of code:

1  ctrs = []
2  for icls, cls_id in enumerate(pred_cls_ids):
3      cls_msk = (mask == cls_id)
4      ms = MeanShiftTorch(bandwidth=radius)
5      ctr, ctr_labels = ms.fit(pred_ctr[cls_msk, :])
6      ctrs.append(ctr.detach().contiguous().cpu().numpy())
7  ctrs = torch.from_numpy(np.array(ctrs).astype(np.float32)).cuda()
8  print(ctrs.size())
9  n_ctrs, _ = ctrs.size()

When running this I get the following error:

  File "/media/datasets/benjamin/repos/PVN3D/pvn3d/lib/utils/pvn3d_eval_utils.py", line 60, in cal_frame_poses
    n_ctrs, _ = ctrs.size()
ValueError: not enough values to unpack (expected 2, got 1)

which makes sense with the output of the print statement in line 8 being:

torch.Size([0])
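
For reference, a shape like this can be reproduced without a GPU. The sketch below assumes (it is not confirmed by the traceback) that the loop body never ran, so the list of centers was still empty when it was converted; only NumPy is used, and the variable names are illustrative.

```python
import numpy as np

# Assumption: the loop added nothing, so the list of centers stayed empty.
ctrs = []
arr = np.array(ctrs).astype(np.float32)
print(arr.shape)  # (0,)

# A one-element shape cannot be unpacked into two names.
try:
    n_ctrs, _ = arr.shape
    message = ""
except ValueError as exc:
    message = str(exc)
print(message)
```

An empty list converts to a one-dimensional array of length 0, not to a two-dimensional array with zero rows, which is why the two-value unpacking fails.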

A few resources on the internet that discuss tensor.shape and tensor.size() suggest casting the result of tensor.size() to a list, which in my case means doing the following:

8  print(list(ctrs.size()))
9  n_ctrs, _ = list(ctrs.size())

This produces the same error though:

  File "/media/datasets/benjamin/repos/PVN3D/pvn3d/lib/utils/pvn3d_eval_utils.py", line 60, in cal_frame_poses
    n_ctrs, _ = list(ctrs.size())
ValueError: not enough values to unpack (expected 2, got 1)

with the output of line 8 now being

[0]
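
A short sketch of why the cast makes no difference: torch.Size is a subclass of tuple, so a plain tuple stands in for it here, and the helper unpack_two is a hypothetical name used only for illustration.

```python
# torch.Size is a tuple subclass, so a plain tuple stands in for it here.
size = (0,)
as_list = list(size)
print(as_list)       # [0]
print(len(as_list))  # 1


def unpack_two(seq):
    """Return True if seq can be unpacked into exactly two names."""
    try:
        _first, _second = seq
    except ValueError:
        return False
    return True


print(unpack_two(size), unpack_two(as_list), unpack_two([2, 3]))
```

Casting changes the container type but not the number of dimensions, so the unpacking only succeeds once the size really has two entries.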

If I understand this correctly, the problem is not the type of my output; rather, I need to "assemble" my parallelized tensors to get the actual size of ctrs. I come to this conclusion because I can run the program in a debugger without any value errors of that sort. The output of the print statement while debugging is, for example:

[2, 3]

Is there any change that I can make to that code to get the correct size of the data without doing changes that potentially alter behavior of the code coming after? Something like putting together those parallelized tensors for one line of code to get the size and then immediately parallelizing them again.
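
One possible guard, sketched under assumptions: if the one-dimensional size comes from an empty list of centers rather than from parallelization, forcing a fixed column count keeps the array two-dimensional even with zero rows. The helper name stack_centers and the default of 3 coordinates (taken from the [2, 3] debug output) are assumptions, not part of PVN3D.

```python
import numpy as np


def stack_centers(ctrs, n_coords=3):
    """Stack per-class centers into an (n, n_coords) float32 array.

    Hypothetical helper: n_coords=3 is assumed from the [2, 3] debug output.
    """
    return np.asarray(ctrs, dtype=np.float32).reshape(-1, n_coords)


# The empty case now has two dimensions, so unpacking works.
empty = stack_centers([])
n_ctrs, _ = empty.shape
print(n_ctrs)

# The non-empty case keeps its shape.
two = stack_centers([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
print(two.shape)
```

The result could then be passed to torch.from_numpy(...).cuda() as in line 7. Code further down would still need to cope with zero centers, so this only moves the failure if that case is not handled there.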

Question from: https://stackoverflow.com/questions/66063723/different-behavior-of-tensors-debugging-vs-normal-execution-on-gpu
