Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Google Cloud Dataflow - How does Apache Beam access Bigtable data?

If BigtableIO.Read is run in Dataflow, is the data accessed via a Bigtable node, or does the read go directly to the Bigtable tablets?

The Bigtable architecture documentation says:

client requests go through a front-end server before they are sent to a Cloud Bigtable node

and goes on to say:

A Cloud Bigtable table is sharded into blocks of contiguous rows, called tablets to help balance the workload of queries... Tablets are stored on Colossus, Google's file system, in SSTable format

(The concern is whether a Dataflow job running at the same time as users are making individual requests, which definitely go through the nodes, will cause a small or a large amount of contention. I would guess that if the Dataflow job went through the nodes, there would be significantly more contention than if it hit the tablets directly.)

Question from: https://stackoverflow.com/questions/65876283/how-does-apache-beam-access-bigtable-data


1 Answer


The Beam Bigtable connector uses the public Cloud Bigtable API, so its requests go through the Bigtable front-end servers and on to the nodes, just like any other client request. It does not read the tablets directly.

See here for a bit more detail on how the Beam connector uses the Bigtable client API.
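
To make the contention concern concrete, here is a toy model of the request path described above (client, then front-end server, then node). It is only a sketch and does not use the Bigtable or Beam APIs: the `Node` and `FrontEnd` classes, the two-node layout, the key-range split at "m", and the client names are all hypothetical, chosen purely for illustration. The point it shows is that a batch scan and individual user reads end up being served by the same nodes.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a Bigtable node that serves some tablets."""
    name: str
    requests_by_client: Counter = field(default_factory=Counter)

    def read(self, client: str, row_key: str) -> str:
        # Every read is counted against the node, whoever sent it.
        self.requests_by_client[client] += 1
        return f"{row_key} served by {self.name}"


@dataclass
class FrontEnd:
    """Hypothetical front-end server that forwards each request to a node."""
    nodes: list

    def route(self, client: str, row_key: str) -> str:
        # Toy routing (an assumption): tablets are contiguous key ranges,
        # so keys before "m" go to the first node and the rest to the second.
        node = self.nodes[0] if row_key < "m" else self.nodes[1]
        return node.read(client, row_key)


def simulate() -> list:
    nodes = [Node("node-a"), Node("node-b")]
    front_end = FrontEnd(nodes)

    # A batch job scanning the whole table through the public API path.
    for key in ["apple", "banana", "cherry", "mango", "peach"]:
        front_end.route("dataflow-job", key)

    # Individual user requests take the same path.
    front_end.route("user", "banana")
    front_end.route("user", "peach")
    return nodes


if __name__ == "__main__":
    for node in simulate():
        print(node.name, dict(node.requests_by_client))
```

In this model both nodes serve requests from the batch job and from the user, which is the shared-capacity situation the question worries about. How much contention that causes in a real cluster depends on node count and load, which this sketch does not attempt to model.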

