Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
584 views
in Technique[技术] by (71.8m points)

numpy - How does tensorflow batch_matmul work?

Tensorflow has a function called batch_matmul which multiplies higher dimensional tensors. But I'm having a hard time understanding how it works, perhaps partially because I'm having a hard time visualizing it.

enter image description here

What I want to do is multiply a matrix by each slice of a 3D tensor, but I don't quite understand what the shape of tensor a is. Is z the innermost dimension? Which of the following is correct?

enter image description here

I would most prefer the first to be correct -- it's most intuitive to me and easy to see in the .eval() output. But I suspect the second is correct.

Tensorflow says that batch_matmul performs:

out[..., :, :] = matrix(x[..., :, :]) * matrix(y[..., :, :])

What does that mean? What does that mean in the context of my example? What is being multiplied with with what? And why aren't I getting a 3D tensor the way I expected?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You can imagine it as doing a matmul over each training example in the batch.

For example, if you have two tensors with the following dimensions:

a.shape = [100, 2, 5]
b.shape = [100, 5, 2]

and you do a batch tf.matmul(a, b), your output will have the shape [100, 2, 2].

100 is your batch size, the other two dimensions are the dimensions of your data.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...