Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


How do I use Signed URLs to upload to Google Storage bucket in Python?

I am able to create signed URLs and just need to know what to do with them after they are created.

There are several examples using JavaScript to upload via a signed URL, but I cannot find any in Python. I am trying to use signed URLs as a workaround for the 32 MB request size limit imposed by Google App Engine for my Flask application.

Here is my Python app.py script (not the full functionality of my app, just trying to upload to my bucket successfully):

from flask import Flask, request, render_template
from google.cloud import storage
import pandas as pd
import os
import gcsfs

bucket_name = "my-bucket"

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/path/to/file.json'

app = Flask(__name__)

def upload_blob(bucket_name, source_file_name, destination_blob_name):
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_file(source_file_name)

    print("success")

@app.route('/')
def homepage():
    return render_template('home.html')

@app.route('/', methods=['POST'])  # GET on '/' is already served by homepage()
def upload_file():
    if request.method == 'POST':
        file1 = request.files['file1'] 
        file2 = request.files['file2']
        upload_blob(bucket_name, file1, 'file-1')
        upload_blob(bucket_name, file2, 'file-2')
        df = pd.read_csv('gs://' + bucket_name + '/' + 'file-1')
        print(df.shape)
        return "done"


if __name__ == "__main__":
  app.run(debug=True)

Here is the function I am using to create the signed URL:

import datetime

def generate_upload_signed_url_v4(bucket_name, blob_name):

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(blob_name)

    url = blob.generate_signed_url(
        version="v4",
        # This URL is valid for 15 minutes
        expiration=datetime.timedelta(minutes=15),
        # Allow PUT requests using this URL.
        method="PUT",
        content_type="application/octet-stream",
    )
    print(url)
    return url

generate_upload_signed_url_v4(bucket_name, 'file.csv')
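
A note on what to do with the URL once it exists: it is used by sending an HTTP PUT request whose body is the file's bytes and whose Content-Type header equals the content_type passed to generate_signed_url. The following is a minimal sketch using only the standard library. The helper name build_signed_put and the example URL are hypothetical, and the request is only built here, not sent.

```python
import urllib.request


def build_signed_put(signed_url, payload, content_type="application/octet-stream"):
    """Build (but do not send) the PUT request a signed upload URL expects."""
    return urllib.request.Request(
        signed_url,
        data=payload,  # raw file bytes, not a multipart form
        method="PUT",  # must match method="PUT" used when signing
        headers={"Content-Type": content_type},  # must match the signed content_type
    )


# Hypothetical URL standing in for the value returned by generate_upload_signed_url_v4.
example_url = "https://storage.googleapis.com/my-bucket/file.csv?X-Goog-Signature=abc"
req = build_signed_put(example_url, b"a,b\n1,2\n")
print(req.get_method(), req.full_url)
# To actually upload, the request would be sent with urllib.request.urlopen(req).
```

The same request can be made with any HTTP client; what matters is the verb, the raw body, and the matching Content-Type header.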

And below is my home.html:

<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="UTF-8">
   <title>test upload</title>
</head>
<body>
    <h3> test upload </h3>

    <form method="POST" action="/" enctype="multipart/form-data">
        <p>Upload file1 below</p>
        <input type="file" name="file1"> 
        <br>
        <br>
        <p>Upload file2 below</p>
        <input type="file" name="file2">
        <br>
        <br>
        <input type="submit" value="upload">
    </form>


</body>
</html>

Based on what I researched, here is the CORS configuration for the bucket I am trying to upload to:


[
{"maxAgeSeconds": 3600, 
"method": ["GET", "PUT", "POST"], 
"origin": ["https://my-app.uc.r.appspot.com", "http://local.machine.XXXX/"], 
"responseHeader": ["Content-Type"]}
]
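
One thing that may be worth checking in a configuration like this: browsers send the Origin header as scheme://host[:port], with no path or trailing slash, so a configured origin that ends in "/" may not match. The sketch below flags such entries. The helper name origins_with_path is hypothetical, the second origin is a made-up stand-in for the redacted local address, and whether Cloud Storage tolerates a trailing slash is an assumption not verified here.

```python
from urllib.parse import urlsplit


def origins_with_path(cors_config):
    """Return configured origins that carry a path, which an Origin header never does."""
    flagged = []
    for rule in cors_config:
        for origin in rule.get("origin", []):
            # "*" is the wildcard origin and is not a URL, so skip it.
            if origin != "*" and urlsplit(origin).path:
                flagged.append(origin)
    return flagged


cors = [{
    "maxAgeSeconds": 3600,
    "method": ["GET", "PUT", "POST"],
    # The second entry is a hypothetical local origin with a trailing slash.
    "origin": ["https://my-app.uc.r.appspot.com", "http://localhost:5000/"],
    "responseHeader": ["Content-Type"],
}]
print(origins_with_path(cors))
```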

Does the signed URL that is generated go in the html form? Does it need to go into my upload_file function?

Finally, when I paste the signed URL into my browser it shows this error:


<Error>
<Code>MalformedSecurityHeader</Code>
<Message>Your request has a malformed header.</Message>
<ParameterName>content-type</ParameterName>
<Details>Header was included in signedheaders, but not in the request.</Details>
</Error>
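
A likely reading of this error: the address bar issues a GET request with no Content-Type header, while the URL was signed for a PUT that includes content-type. A V4 signed URL lists the headers it was signed with in its X-Goog-SignedHeaders query parameter, so they can be inspected. This is a small sketch; the helper name signed_headers and the example URL are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def signed_headers(signed_url):
    """List the request headers a V4 signed URL was signed with."""
    query = parse_qs(urlsplit(signed_url).query)
    # parse_qs percent-decodes values, so "content-type%3Bhost" becomes "content-type;host".
    value = query.get("X-Goog-SignedHeaders", [""])[0]
    return value.split(";") if value else []


# Hypothetical signed URL with a shortened, fake signature.
example_url = (
    "https://storage.googleapis.com/my-bucket/file.csv"
    "?X-Goog-Algorithm=GOOG4-RSA-SHA256"
    "&X-Goog-SignedHeaders=content-type%3Bhost"
    "&X-Goog-Signature=abc"
)
print(signed_headers(example_url))  # ['content-type', 'host']
```

Every header in that list has to be present on the request, which a plain browser navigation does not provide.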

This is my first SO question so I apologize if it is poorly constructed. I am super lost and new to GCP. I have searched SO for a while now, and not found a use-case with Python/Flask where I can see how the signed URL is incorporated into the file upload process.

Again, I am building a web app on Google App Engine flex, and I need signed URLs to work around the 32 MB file upload restriction.

UPDATE

I got the signed URL component figured out after realizing I needed to simply make a request to the signed URL.

Below is my new script that is loaded in App Engine (imports and "if name = main..." removed for snippet below).


os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/path/to/file.json'

EXPIRATION = datetime.timedelta(minutes=15)
FILE_TYPE = 'text/csv'
BUCKET = 'my-bucket'

def upload_via_signed(bucket_name, blob_name, filename, expiration, file_type):
    bucket = storage.Client().get_bucket(bucket_name)

    blob = bucket.blob(blob_name)

    signed_url = blob.generate_signed_url(method='PUT', expiration=expiration, content_type=file_type)

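    # `filename` is the uploaded FileStorage object; read the copy that
    # upload_file() saved under UPLOAD_FOLDER rather than a path relative to the cwd.
    saved_path = os.path.join(app.config['UPLOAD_FOLDER'], secure_filename(filename.filename))
    with open(saved_path, 'rb') as f:
        requests.put(signed_url, data=f, headers={'Content-Type': file_type})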

app = Flask(__name__)

app.config['UPLOAD_FOLDER'] = '/tmp'

@app.route('/')
def homepage():
    return render_template('home.html')

@app.route('/', methods=['POST'])  # GET on '/' is already served by homepage()
def upload_file():
    if request.method == 'POST':

        diag = request.files['file']
        filename_1 = secure_filename(diag.filename)
        filepath_1 = os.path.join(app.config['UPLOAD_FOLDER'], filename_1)
        diag.save(filepath_1)

        person = request.files['person']
        filename_2 = secure_filename(person.filename)
        filepath_2 = os.path.join(app.config['UPLOAD_FOLDER'], filename_2)
        person.save(filepath_2)

        upload_via_signed(BUCKET, 'diag.csv', diag, EXPIRATION, FILE_TYPE)

        upload_via_signed(BUCKET, 'person.csv', person, EXPIRATION, FILE_TYPE)

        df_diag = pd.read_csv('gs://' + BUCKET + '/' + 'diag.csv')
        print(df_diag.shape)
        return "done"

The code above still throws the 413 (Request Entity Too Large) error. I think it's because the 'POST' still goes through App Engine even though I am creating signed URLs. What am I doing wrong, and how should the code be restructured so that the user uploads directly to Google Cloud Storage via the signed URLs without triggering the 413 error?

question from:https://stackoverflow.com/questions/65924258/how-do-i-use-signed-urls-to-upload-to-google-storage-bucket-in-python


1 Answer

0 votes
by (71.8m points)

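Once you have generated the signed URL on the server, send it back to the client and have the browser upload the file directly to Cloud Storage, so the file body never passes through App Engine. You can send the file with a plain fetch PUT request or, as I prefer, with axios:

await axios.put(url, file, { headers: { 'Content-Type': 'text/csv' } });

Here url is the signed URL and file is the File object taken from the file input. The Content-Type header must match the content_type the URL was signed with (text/csv in the updated code in the question), otherwise Cloud Storage rejects the request; the MalformedSecurityHeader error in the question is what a request without that header produces. Send the file itself as the request body: wrapping it in FormData would store the multipart envelope in the object rather than the file's contents.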
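
On the server side, the Python part then shrinks to handing out the signed URL. Below is a minimal sketch of that step, assuming a blob object from google-cloud-storage. The helper name build_upload_ticket and the FakeBlob stand-in (used only so the sketch runs without credentials) are hypothetical.

```python
import datetime


def build_upload_ticket(blob, content_type="text/csv", minutes=15):
    """Return what the browser needs to upload straight to Cloud Storage."""
    url = blob.generate_signed_url(
        version="v4",
        expiration=datetime.timedelta(minutes=minutes),
        method="PUT",
        content_type=content_type,
    )
    # The client must send exactly this method and Content-Type with the upload.
    return {"url": url, "method": "PUT", "headers": {"Content-Type": content_type}}


class FakeBlob:
    """Stand-in for google.cloud.storage.Blob; records the arguments it was called with."""

    def generate_signed_url(self, **kwargs):
        self.kwargs = kwargs
        return "https://storage.googleapis.com/my-bucket/diag.csv?X-Goog-Signature=fake"


fake = FakeBlob()
ticket = build_upload_ticket(fake)
print(ticket["method"], ticket["headers"])
```

In the Flask app, a small GET route could call this with bucket.blob('diag.csv') and return the dictionary as JSON. The page's script would then PUT the file to ticket["url"] and only afterwards notify App Engine with a small request, so that pd.read_csv runs once the object is in the bucket. Those route and script details are assumptions about how the pieces could fit together, not part of the original answer.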

