Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

utf-8 - Python decode to Arabic

I am using Apache Airflow, and this is a function in my PythonOperator that collects the data. My data contains Arabic strings. When I execute the query without setting the character set to utf8, I get "???? ???". So I added the setup below, but the issue is still not solved: I end up with "ù?-ù?ˉ ?§ùù?§???±ù".

In Arabic it is supposed to be "???? ???????".

# Import path assumes the Airflow 2 MySQL provider package
from airflow.providers.mysql.hooks.mysql import MySqlHook

query = "select * from test_sample limit 1;"
source_hook = MySqlHook(mysql_conn_id='mysql_conn', schema='mysql')
source_conn = source_hook.get_conn()
source_cursor = source_conn.cursor()
# Attempt to force UTF-8 for the session
source_cursor.execute("SET NAMES utf8;")
source_cursor.execute("SET CHARACTER SET utf8;")
source_cursor.execute("SET character_set_connection=utf8;")
source_cursor.execute(query)
# Map column names to values for each fetched row
columns = [col[0] for col in source_cursor.description]
records_data = [dict(zip(columns, row)) for row in source_cursor.fetchall()]
record = records_data[0]
test_a = record['name']
print(test_a)

You can check at this link here: when you paste ù?-ù?ˉ ?§ùù?§???±ù, you can see the Arabic output above (???? ???????). But I'm not able to get it in my code. Any ideas?
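For readers following along, both symptoms described above can be reproduced with the standard library alone. This is only an illustration under assumptions, not a diagnosis of this exact setup: the sample word is a placeholder (the original text did not survive in this post), and latin-1 stands in for whatever non-UTF-8 charset the connection was actually using.

```python
# Hypothetical sample text; not the value stored in the table from the question.
sample = "\u0645\u0631\u062d\u0628\u0627"  # an Arabic word, written with escapes

# Symptom 1: the text is forced into a charset that has no Arabic letters,
# so every character is replaced by "?".
question_marks = sample.encode("latin-1", errors="replace").decode("latin-1")

# Symptom 2: the bytes are valid UTF-8 but are decoded with the wrong charset,
# which yields mojibake instead of Arabic.
utf8_bytes = sample.encode("utf-8")
mojibake = utf8_bytes.decode("latin-1")

# Reversing the wrong decode recovers the text, which is presumably what an
# online "fix encoding" tool does.
recovered = mojibake.encode("latin-1").decode("utf-8")

print(question_marks)       # ?????
print(ascii(mojibake))      # escaped so it prints safely on any terminal
print(recovered == sample)  # True
```

If the round trip in the last step works on your real data, the stored bytes are probably fine and the problem is more likely in how the connection decodes them.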

I created my table as follows:

CREATE TABLE IF NOT EXISTS `test_sample` (
  `ID` int(10) NOT NULL,
  `name` varchar(255)  DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
question from:https://stackoverflow.com/questions/65641879/python-decode-to-arabic


1 Answer


Thanks, Rick and everyone, for your responses!

I'll post here how I resolved my issue:

I dug much deeper into this than was actually necessary, because the answer was in front of me all along. Since I'm using Apache Airflow, what I had missed was adding {"charset":"utf8"} to the Extra field of the connection configuration. That solved it! Here is the reference: (https://airflow.apache.org/docs/apache-airflow-providers-mysql/stable/connections/mysql.html)
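As a rough sketch of what that setting looks like, the snippet below builds the Extra JSON and an equivalent connection URI using only the standard library. The user, password, host, and port are placeholders I made up; only the 'mysql' schema and the charset come from this post. The URI form, with extras passed as URL-encoded query parameters, is my reading of the Airflow connection docs, so treat it as an assumption to verify against your Airflow version.

```python
import json
from urllib.parse import quote, urlencode

# The value that goes into the "Extra" field of the Airflow connection.
extra = {"charset": "utf8"}
extra_json = json.dumps(extra)

# Placeholder credentials and host (hypothetical); schema is from the question.
user, password, host, port, schema = "user", "secret", "localhost", 3306, "mysql"

# Assumed URI form: extras are appended as URL-encoded query parameters.
conn_uri = (
    f"mysql://{quote(user, safe='')}:{quote(password, safe='')}"
    f"@{host}:{port}/{schema}?{urlencode(extra)}"
)

print(extra_json)  # {"charset": "utf8"}
print(conn_uri)    # mysql://user:secret@localhost:3306/mysql?charset=utf8
```

A URI like this could be exported as the AIRFLOW_CONN_MYSQL_CONN environment variable for the mysql_conn connection ID, or the JSON can simply be pasted into the Extra field in the Airflow UI.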

Additionally, my mysql.cnf configuration is set as follows (not sure if it makes any difference, though):

[client]
default-character-set = utf8mb4
[mysqld]
skip-character-set-client-handshake
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci
init-connect = SET NAMES utf8mb4
[mysql]
default-character-set = utf8mb4

Also, for the table:

CREATE TABLE IF NOT EXISTS `test_sample` (
  `ID` int(10) NOT NULL,
  `name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci;
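A side note on utf8 versus utf8mb4: in MySQL, utf8 has historically been an alias for utf8mb3, which stores at most three bytes per character, while utf8mb4 covers the full four-byte range. Arabic letters fit in either one, as the quick check below suggests. The sample characters are arbitrary picks for illustration.

```python
# Arbitrary sample characters chosen for illustration.
samples = {
    "latin a": "a",
    "arabic meem": "\u0645",
    "cjk": "\u4e2d",
    "emoji": "\U0001F600",
}

# Number of bytes each character needs when encoded as UTF-8.
byte_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}

# Characters that need 4 bytes do not fit MySQL's 3-byte utf8 (utf8mb3).
needs_utf8mb4 = [name for name, n in byte_lengths.items() if n > 3]

print(byte_lengths)
print(needs_utf8mb4)
```

So the switch to utf8mb4 is probably not what fixed the Arabic output here, but it does avoid surprises with four-byte characters later on.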

And I removed the statements that are no longer needed from the MySqlHook connection code:

source_cursor.execute("SET NAMES utf8;")
source_cursor.execute("SET CHARACTER SET utf8;")
source_cursor.execute("SET character_set_connection=utf8;")

Hope this helps others too. Cheers!