gcloud - Google Cloud SDK from a Dataproc Cluster


What is the right way to use/install Python Google Cloud APIs, such as Pub/Sub, on a Google Dataproc cluster? For example, if I'm using Zeppelin/PySpark on the cluster and want to use the Pub/Sub API, how should I prepare for that?

It is unclear to me what is and is not installed during default cluster provisioning, and whether/how I should try to install Python libraries for the Google Cloud APIs.

I realise that, additionally, there may be scopes/authentication to set up. To be clear, I can use the APIs locally, but I'm not sure of the cleanest way to make the APIs accessible from the cluster, and I don't want to perform any unnecessary steps.

In general, at the moment, you need to bring your own client libraries for the various Google APIs, unless you are using the Google Cloud Storage connector or the BigQuery connector from Java, or via the RDD methods in PySpark, which automatically delegate to the Java implementations.
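For example, here is a minimal sketch of publishing to Pub/Sub from a PySpark or Zeppelin session, assuming you have installed the google-cloud-pubsub package on the cluster nodes yourself (e.g. with pip, possibly from an initialization action); the project and topic names are hypothetical placeholders:

    # Assumes `pip install google-cloud-pubsub` was run on the cluster nodes.
    from google.cloud import pubsub_v1

    # Hypothetical project and topic names, for illustration only.
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("my-project", "my-topic")

    # Publish a message; the client authenticates with the VM's
    # default service account credentials.
    future = publisher.publish(topic_path, b"hello from Dataproc")
    print(future.result())  # prints the server-assigned message ID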

For authentication, you should use --scopes https://www.googleapis.com/auth/pubsub and/or --scopes https://www.googleapis.com/auth/cloud-platform when creating the cluster, and then the service account on the Dataproc cluster's VMs will be able to authenticate and use Pub/Sub via the default installed credentials flow.
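As a quick sanity check on a cluster node, the following sketch verifies that Application Default Credentials resolve to the VM's service account, assuming the google-auth package is available (it is a dependency of the Cloud client libraries):

    # Works on Dataproc VMs provided the cluster was created with the
    # pubsub or cloud-platform scope; no key files need to be copied.
    import google.auth

    credentials, project_id = google.auth.default(
        scopes=["https://www.googleapis.com/auth/pubsub"]
    )
    print(project_id)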

