Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
694 views
in Technique[技术] by (71.8m points)

unicode - Python UTF-8 Lowercase Turkish Specific Letter

with using python 2.7:

>myCity = 'Isparta'
>myCity.lower()
>'isparta'
#-should be-
>'?sparta'

tried some decoding, (like, myCity.decode("utf-8").lower()) but could not find how to do it.

how can lower this kinds of letters? ('I' > '?', '?' > 'i' etc)

EDIT: In Turkish, lower case of 'I' is '?'. Upper case of 'i' is '?'

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Some have suggested using the tr_TR.utf8 locale. At least on Ubuntu, perhaps related to this bug, setting this locale does not produce the desired result:

import locale
locale.setlocale(locale.LC_ALL, 'tr_TR.utf8')

myCity = u'Isparta ?sparta'
print(myCity.lower())
# isparta isparta

So if this bug affects you, as a workaround you could perform this translation yourself:

lower_map = {
    ord(u'I'): u'?',
    ord(u'?'): u'i',
    }

myCity = u'Isparta ?sparta'
lowerCity = myCity.translate(lower_map)
print(lowerCity)
# ?sparta isparta

prints

?sparta isparta

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...