So I have a program (that I didn't write) that runs on a GPU and is parallelized with CUDA v10.2. It contains the following block of code:
1 ctrs = []
2 for icls, cls_id in enumerate(pred_cls_ids):
3     cls_msk = (mask == cls_id)
4     ms = MeanShiftTorch(bandwidth=radius)
5     ctr, ctr_labels = ms.fit(pred_ctr[cls_msk, :])
6     ctrs.append(ctr.detach().contiguous().cpu().numpy())
7 ctrs = torch.from_numpy(np.array(ctrs).astype(np.float32)).cuda()
8 print(ctrs.size())
9 n_ctrs, _ = ctrs.size()
When running this I get the following error:
File "/media/datasets/benjamin/repos/PVN3D/pvn3d/lib/utils/pvn3d_eval_utils.py", line 60, in cal_frame_poses
n_ctrs, _ = ctrs.size()
ValueError: not enough values to unpack (expected 2, got 1)
which makes sense given the output of the print statement in line 8:
torch.Size([0])
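To illustrate (a standalone snippet, not taken from the original code): unpacking into two variables only works when size() reports two dimensions.

    import torch

    empty = torch.empty(0)     # 1-D tensor with zero elements
    print(empty.size())        # torch.Size([0]) -> only one value, so "a, b = empty.size()" fails
    full = torch.empty(2, 3)   # 2-D tensor
    n, d = full.size()         # works: two dimensions give two values (2 and 3)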
A few resources on the internet talking about tensors and tensor.shape/tensor.size() suggest casting the tensor.size() to a list, meaning in my case I do the following:
8 print(list(ctrs.size()))
9 n_ctrs, _ = list(ctrs.size())
This produces the same error though:
File "/media/datasets/benjamin/repos/PVN3D/pvn3d/lib/utils/pvn3d_eval_utils.py", line 60, in cal_frame_poses
n_ctrs, _ = list(ctrs.size())
ValueError: not enough values to unpack (expected 2, got 1)
with the output of line 8 now being
[0]
If I understand that right, the problem is not the type of my output; rather, I need to "assemble" my parallelized tensors to get the actual size of ctrs.
I come to this conclusion because I can run this program in the debugger without any value errors of that sort. The output of the print statement while debugging is (for example):
[2, 3]
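For comparison, the zero-size output can be reproduced in isolation when the list fed into line 7 is empty (a standalone sketch; whether that is actually what happens in my run is part of my question):

    import numpy as np
    import torch

    ctrs = []                                                 # nothing collected in the loop
    t = torch.from_numpy(np.array(ctrs).astype(np.float32))  # np.array([]) has shape (0,)
    print(list(t.size()))                                     # [0], matching the output above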
Is there any change I can make to that code to get the correct size of the data without potentially altering the behavior of the code that follows? Something like gathering those parallelized tensors for one line of code to get the size and then immediately parallelizing them again.
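For context, the only workaround I can think of is a hypothetical guard around line 7 that keeps the tensor 2-D even when nothing was collected (the fallback shape (0, 3) is my assumption about what the later code expects, so I have not applied this):

    if len(ctrs) > 0:
        ctrs = torch.from_numpy(np.array(ctrs).astype(np.float32)).cuda()
    else:
        # assumed fallback: an empty 2-D tensor so the unpacking below still works
        ctrs = torch.zeros((0, 3), dtype=torch.float32).cuda()
    n_ctrs, _ = ctrs.size()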
question from:
https://stackoverflow.com/questions/66063723/different-behavior-of-tensors-debugging-vs-normal-execution-on-gpu