Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

mysql - How to encode (utf8mb4) in Python

How do I encode something in utf8mb4 in Python?

I have two sets of data: data I am migrating from Parse to my new MySQL database, and data going forward (which talks only to my new database). My database uses utf8mb4 in order to store emoji and accented letters.

The first set of data only shows up correctly (when emoji and accents are involved) when I have this in my Python script:

MySQLdb.escape_string(unicode(xstr(data.get('message'))).encode('utf-8')) 

and when reading from the MySQL database in PHP:

$row["message"] = utf8_encode($row["message"]);

The second set of data only shows up correctly (when emoji and accents are involved) when I DON'T include the utf8_encode($row["message"]) portion. I am trying to reconcile these so that both sets of data are returned correctly to my iOS app. Please help!

1 Answer

I have struggled myself with the correct exchange of the full range of UTF-8 characters between Python and MySQL for the sake of Emoji and other characters beyond the U+FFFF codepoint.
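
To see why this matters, here is a small sketch (not part of the original setup) that prints how many bytes a few characters need in UTF-8. MySQL's older utf8 character set stores at most three bytes per character, so anything beyond U+FFFF, which needs four, only fits in utf8mb4. The helper name utf8_length is just an illustrative choice.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# An ASCII letter, an accented letter, the euro sign, and an emoji (U+1F600).
samples = ("e", "\u00e9", "\u20ac", "\U0001F600")

for ch in samples:
    beyond_bmp = ord(ch) > 0xFFFF
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s), beyond U+FFFF: {beyond_bmp}")
```

Only the last sample needs four bytes, which is exactly the case utf8mb4 exists for.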

To be sure that everything worked fine, I had to do the following:

  1. make sure utf8mb4 was used for CHAR, VARCHAR, and TEXT columns in MySQL
  2. enforce UTF-8 in Python
  3. enforce UTF-8 to be used between Python and MySQL
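
For step 1, one way to convert an existing table is MySQL's ALTER TABLE ... CONVERT TO CHARACTER SET statement. The sketch below only builds the statement as a string; the function name convert_table_sql, the table name messages, and the utf8mb4_unicode_ci collation are assumptions for illustration, so adjust them to your schema and back up the table before running anything like this.

```python
import re


def convert_table_sql(table: str, collation: str = "utf8mb4_unicode_ci") -> str:
    """Build an ALTER TABLE statement that converts a table to utf8mb4."""
    # Accept only plain utf8mb4 collation names, since this value is not quoted.
    if not re.fullmatch(r"utf8mb4_[a-z0-9_]+", collation):
        raise ValueError("expected a utf8mb4 collation name")
    # Backticks quote the identifier; a literal backtick is escaped by doubling it.
    quoted = "`" + table.replace("`", "``") + "`"
    return f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation}"


# "messages" is a hypothetical table name.
print(convert_table_sql("messages"))
```

The resulting string can be passed to cursor.execute() once you have a connection.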

To enforce UTF-8 in Python, add the following line as the first or second line of your Python script (it declares the encoding of the source file, which Python 2 otherwise assumes to be ASCII):

# -*- coding: utf-8 -*-

To enforce UTF-8 between Python and MySQL, set up the MySQL connection as follows:

import MySQLdb

# Connect to MySQL (replace each ### with your own settings).
dbc = MySQLdb.connect(host='###', user='###', passwd='###', db='###', use_unicode=True)

# Create a cursor.
cursor = dbc.cursor()

# Enforce UTF-8 for the connection.
cursor.execute('SET NAMES utf8mb4')
cursor.execute("SET CHARACTER SET utf8mb4")
cursor.execute("SET character_set_connection=utf8mb4")

# Do database stuff.

# Commit data.
dbc.commit()

# Close cursor and connection.
cursor.close()
dbc.close()
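
As a possible alternative to the SET statements, MySQLdb.connect also accepts a charset keyword, which tells the driver itself which character set the connection uses. The sketch below is an assumption on my side rather than what I used above; the helper names utf8mb4_connect_kwargs and connect_utf8mb4 are made up for illustration, and you should check the behaviour against your driver version.

```python
def utf8mb4_connect_kwargs(host, user, passwd, db):
    """Collect connection settings that request utf8mb4 from the driver."""
    return {
        "host": host,
        "user": user,
        "passwd": passwd,
        "db": db,
        "charset": "utf8mb4",   # character set for the client connection
        "use_unicode": True,    # return text columns as unicode strings
    }


def connect_utf8mb4(host, user, passwd, db):
    """Open a MySQLdb connection using the settings above."""
    import MySQLdb  # imported here so the helper above works without the driver
    return MySQLdb.connect(**utf8mb4_connect_kwargs(host, user, passwd, db))


# Usage (placeholders, as in the snippet above):
# dbc = connect_utf8mb4('###', '###', '###', '###')
```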

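This way, you don't need functions such as encode (in Python) or utf8_encode (in PHP). Note that utf8_encode converts ISO-8859-1 to UTF-8, so applying it to text that is already UTF-8 garbles it.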
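
To illustrate why an extra utf8_encode breaks data that is already correct, here is a rough Python imitation of what happens when UTF-8 bytes are treated as ISO-8859-1 and encoded again. The function names are hypothetical, and whether the repair step applies to your migrated rows depends on how they were actually stored, so treat it as a diagnostic sketch only.

```python
def simulate_utf8_encode_on_utf8(text: str) -> str:
    """Show what a reader sees when UTF-8 bytes are re-encoded as if they were ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


def undo_double_encoding(garbled: str) -> str:
    """Reverse the step above, recovering the original text."""
    return garbled.encode("latin-1").decode("utf-8")


original = "caf\u00e9"  # "café"
garbled = simulate_utf8_encode_on_utf8(original)
print(garbled)                        # the accented letter turns into two characters
print(undo_double_encoding(garbled))  # back to the original text
```

If both data sets are stored as real utf8mb4 and the connection is utf8mb4 on the Python and PHP sides, neither step should be needed.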

