I tried to run tensorflow-wavenet on Google Cloud ML Engine with gcloud ml-engine jobs submit training, but the cloud job crashed when it tried to read the JSON configuration file:
    with open(args.wavenet_params, 'r') as f:
        wavenet_params = json.load(f)
args.wavenet_params is simply a path to a JSON file that I uploaded to a Google Cloud Storage bucket. The path looks like this: gs://BUCKET_NAME/FILE_PATH.json.
I double-checked that the file path is correct, and I'm sure this part is responsible for the crash, since I commented out everything else.
The crash log file doesn't give much information about what has happened:
Module raised an exception for failing to call a subprocess Command '['python', '-m', u'gcwavenet.train', u'--data_dir', u'gs://wavenet-test-data/VCTK-Corpus-Small/', u'--logdir_root', u'gs://wavenet-test-data//gcwavenet10/logs']' returned non-zero exit status 1.
I replaced wavenet_params = json.load(f) with f.close() and I still get the same result.
Everything works when I run it locally with gcloud ml-engine local train.
I think the problem is either with reading files on gcloud ml-engine in general, or that I can't access the Google Cloud Storage bucket from within a Python file using a path like gs://BUCKET_NAME/FILE_PATH.
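To make the second suspicion concrete, here is a minimal sketch, not a confirmed fix. It assumes that Python's built-in open() only understands local file system paths, and that TensorFlow's file_io module (tensorflow.python.lib.io.file_io), which can read gs:// URLs, is available on the training workers. The helper names load_json_params and builtin_open_fails are hypothetical and not part of tensorflow-wavenet. If the assumption holds, it would also explain why swapping json.load(f) for f.close() changes nothing: the failure would happen in open() itself.

```python
import json
import os
import tempfile


def load_json_params(path):
    """Load a JSON file from a local path or from a gs:// URL."""
    if path.startswith('gs://'):
        # Assumption: TensorFlow is installed with Cloud Storage support.
        # Imported lazily so local paths work without TensorFlow.
        from tensorflow.python.lib.io import file_io
        with file_io.FileIO(path, mode='r') as f:
            return json.loads(f.read())
    # Local paths go through the built-in open() as before.
    with open(path, 'r') as f:
        return json.load(f)


def builtin_open_fails(path):
    """Return True if the built-in open() cannot open the given path."""
    try:
        open(path, 'r').close()
    except (IOError, OSError):
        return True
    return False


# Local demonstration: write a small JSON file and read it back.
tmp_dir = tempfile.mkdtemp()
local_path = os.path.join(tmp_dir, 'wavenet_params.json')
with open(local_path, 'w') as f:
    json.dump({'sample_rate': 16000}, f)

local_params = load_json_params(local_path)

# The built-in open() treats the gs:// string as a (nonexistent) local path.
gs_open_failed = builtin_open_fails('gs://BUCKET_NAME/FILE_PATH.json')
```

In train.py this would mean replacing the with open(...) block with wavenet_params = load_json_params(args.wavenet_params), so the same code path would serve both the local run and the cloud job.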