python - ISO 8859-1 filename not decoding

asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

I'm extracting files from MIME messages in a Python milter and am running into issues with files named like this:

=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?=

I can't seem to decode this name into UTF-8. To solve an earlier ISO-8859-1 issue, I started passing all filenames to this function:

def unicodeConvert(self, fname):
    normalized = False

    while normalized == False:
        try:
            fname  = unicodedata.normalize('NFKD', unicode(fname, 'utf-8')).encode('ascii', 'ignore')
            normalized = True
        except UnicodeDecodeError:
            fname = fname.decode('iso-8859-1')#.encode('utf-8')
            normalized = True
        except UnicodeError:
            fname = unicode(fname.content.strip(codecs.BOM_UTF8), 'utf-8')
            normalized = True
        except TypeError:
            fname = fname.encode('utf-8')

    return fname

which was working until I got to this filename.

Ideas are appreciated as always.

1 Answer

深蓝 · Answer 1 · 2021-10-23T18:24:41+0000

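Your string is an RFC 2047 encoded-word, the format MIME uses for non-ASCII text in headers. The =?ISO-8859-1?Q?...?= wrapper names the charset (ISO-8859-1) and the encoding (Q, a variant of quoted-printable), so the value has to be decoded as a header before any charset conversion. The email.header module handles this for you: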

>>> from email.header import decode_header
>>> try:
...     string_type = unicode  # Python 2
... except NameError:
...     string_type = str      # Python 3
...
>>> for part in decode_header('=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?='):
...     decoded = string_type(*part)
...     print(decoded)
...
Certificado_Zonificación_2010.pdf
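
If the milter runs on Python 3 only, a slightly more defensive variant may be useful, because decode_header() returns a charset of None for header values that contain no encoded-word, and passing None to str() as an encoding fails. The sketch below wraps the standard-library pair decode_header() and make_header(); the helper name decode_mime_filename and the sample name plain_name.pdf are made up for illustration and are not part of the original answer.

```python
from email.header import decode_header, make_header


def decode_mime_filename(raw_name):
    """Decode an RFC 2047 encoded-word header value to text (Python 3).

    Hypothetical helper name, used here only for illustration.
    """
    # decode_header() splits the value into (data, charset) pairs.
    # make_header() reassembles them into a Header object, and str()
    # returns the decoded text. Parts whose charset is None (plain,
    # unencoded text) pass through unchanged.
    return str(make_header(decode_header(raw_name)))


encoded = '=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?='
print(decode_mime_filename(encoded))           # Certificado_Zonificación_2010.pdf
print(decode_mime_filename('plain_name.pdf'))  # plain_name.pdf
```

The result is an ordinary str, so it can be encoded to UTF-8 or normalized with unicodedata afterwards if the filesystem requires it.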
