I struggled myself with correctly exchanging the full range of UTF-8 characters between Python and MySQL, for the sake of emoji and other characters beyond the U+FFFF code point.
To be sure that everything worked fine, I had to do the following:
- make sure utf8mb4 was used for CHAR, VARCHAR, and TEXT columns in MySQL
- enforce UTF-8 in Python
- enforce UTF-8 to be used between Python and MySQL
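To illustrate why the first point matters: MySQL's older utf8 character set stores at most three bytes per character, while any code point above U+FFFF takes four bytes in UTF-8. The following is a minimal sketch, assuming Python 3; the helper name needs_utf8mb4 is made up for illustration and is not part of any library.

```python
def needs_utf8mb4(text):
    """Return True if text contains a code point above U+FFFF.

    Such characters take four bytes in UTF-8, which is more than
    MySQL's legacy three-byte utf8 character set can store.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


def utf8_length(ch):
    """Return the number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    for sample in ("caf\u00e9", "price: \u20ac5", "smile \U0001F600"):
        print(repr(sample), needs_utf8mb4(sample))
```

Running a check like this over existing data is one way to see which strings would be affected by a column that is not yet utf8mb4.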
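To enforce UTF-8 in Python, add the following encoding declaration as the first or second line of your Python script (it matters for Python 2; Python 3 already reads source files as UTF-8 by default):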
# -*- coding: utf-8 -*-
To enforce UTF-8 between Python and MySQL, set up the MySQL connection as follows:
import MySQLdb

# Connect to MySQL; use_unicode=True makes text columns come back as Unicode strings.
dbc = MySQLdb.connect(host='###', user='###', passwd='###', db='###', use_unicode=True)
# Create a cursor.
cursor = dbc.cursor()
# Enforce utf8mb4 for the connection.
cursor.execute('SET NAMES utf8mb4')
cursor.execute('SET CHARACTER SET utf8mb4')
cursor.execute('SET character_set_connection=utf8mb4')
# Do database stuff.
# Commit data.
dbc.commit()
# Close cursor and connection.
cursor.close()
dbc.close()
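As a possible alternative, the character set can also be requested at connect time through the charset keyword of MySQLdb.connect. The sketch below is only an illustration under that assumption: the helper names utf8mb4_connect_kwargs and connect_utf8mb4 are hypothetical, and the driver is imported lazily so the first helper can be inspected without a database.

```python
def utf8mb4_connect_kwargs(host, user, passwd, db):
    """Build keyword arguments for MySQLdb.connect with utf8mb4 requested."""
    return {
        "host": host,
        "user": user,
        "passwd": passwd,
        "db": db,
        "charset": "utf8mb4",   # ask the driver to use utf8mb4 on the connection
        "use_unicode": True,    # return text columns as Unicode strings
    }


def connect_utf8mb4(host, user, passwd, db):
    """Open a connection using the arguments built above (hypothetical helper)."""
    import MySQLdb  # imported here so the kwargs helper works without the driver
    return MySQLdb.connect(**utf8mb4_connect_kwargs(host, user, passwd, db))
```

Whether this fully replaces the explicit SET statements may depend on the driver and server versions, so it is worth checking the session variables after connecting.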
This way, you don't need to use functions such as encode and utf8_encode.
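As a rough sketch of what that looks like in practice, a string containing emoji can be handed to the cursor as a query parameter without any manual encoding. The table messages and its column body are made up for this example, as is the helper name insert_message; any DB-API cursor that uses the %s placeholder style, as MySQLdb does, should work the same way.

```python
def insert_message(cursor, text):
    """Insert text as-is; the driver handles encoding for the connection.

    `messages` and `body` are hypothetical names used only for illustration.
    """
    # %s is a DB-API placeholder, not Python string formatting:
    # the value is passed separately and never encoded by hand.
    cursor.execute("INSERT INTO messages (body) VALUES (%s)", (text,))


# Usage, assuming `cursor` and `dbc` come from a utf8mb4 connection:
#   insert_message(cursor, "hello \U0001F600")
#   dbc.commit()
```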