Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - urllib.quote() throws KeyError

To encode the URI, I used urllib.quote("schönefeld"), but when the string contains non-ASCII characters, it throws

KeyError: u'\xe9'
Code: return ''.join(map(quoter, s))

My input strings are köln, brønshøj, schönefeld, etc.

I only tried printing the strings on Windows (Python 2.7, PyScripter IDE). On Linux it raises the exception (I guess the platform doesn't matter).

This is what I am trying:

from commands import getstatusoutput
from urllib import quote  # quote() lives in urllib on Python 2

queryParams = "schönefeld"
cmdString = "http://baseurl" + quote(queryParams)
print getstatusoutput(cmdString)

Looking into the cause: in urllib.quote(), the exception is actually thrown at return ''.join(map(quoter, s)).

The code in urllib is:

def quote(s, safe='/'):
    if not s:
        if s is None:
            raise TypeError('None object cannot be quoted')
        return s
    cachekey = (safe, always_safe)
    try:
        (quoter, safe) = _safe_quoters[cachekey]
    except KeyError:
        safe_map = _safe_map.copy()
        safe_map.update([(c, c) for c in safe])
        quoter = safe_map.__getitem__
        safe = always_safe + safe
        _safe_quoters[cachekey] = (quoter, safe)
    if not s.rstrip(safe):
        return s
    return ''.join(map(quoter, s))

The exception comes from ''.join(map(quoter, s)): quoter is called for every element of s, and the resulting list is joined with '' and returned.

For the non-ASCII character è, the equivalent key would be %E8, which is present in the _safe_map variable. But when I call quote('è'), it searches for the key \xe8, so the key does not exist and the exception is thrown.

So I added s = [el.upper().replace("\X","%") for el in s] before the call to ''.join(map(quoter, s)), inside a try-except block. Now it works fine.

But I am unsure whether what I have done is the correct approach, or whether it will create other issues. Also, I have 200+ Linux instances, so deploying this fix to all of them would be very difficult.

question from:https://stackoverflow.com/questions/15115588/urllib-quote-throws-keyerror

1 Answer

You are trying to quote Unicode data, so you need to decide how to turn that into URL-safe bytes.
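
The KeyError follows from how the Python 2 quoter works: it looks each character up in a table keyed by single byte strings, and a non-ASCII unicode character such as u'\xe9' is not one of those keys. The sketch below is a simplified model of that table, not the actual urllib source; `SAFE_CHARS`, `build_table` and the other names are illustrative assumptions.

```python
# Simplified model of the per-byte lookup table behind Python 2's
# urllib.quote(). All names here are illustrative, not urllib internals.
SAFE_CHARS = ('ABCDEFGHIJKLMNOPQRSTUVWXYZ'
              'abcdefghijklmnopqrstuvwxyz'
              '0123456789_.-/')


def build_table():
    """Map every single byte to itself (if safe) or to its %XX escape."""
    table = {}
    for i in range(256):
        key = bytes(bytearray([i]))  # a one-byte string on Python 2 and 3
        if i < 128 and chr(i) in SAFE_CHARS:
            table[key] = chr(i)
        else:
            table[key] = '%{:02X}'.format(i)
    return table


table = build_table()

# A non-ASCII unicode character is not a one-byte key, so the lookup fails.
try:
    table[u'\xe9']
    lookup_failed = False
except KeyError:
    lookup_failed = True

# After encoding, every byte of the result is a valid key.
encoded = u'\xe9'.encode('utf-8')  # two bytes: C3 A9
escaped = ''.join(table[bytes(bytearray([b]))] for b in bytearray(encoded))
```

In this model the unicode lookup raises KeyError, while the UTF-8 encoded form escapes to %C3%A9, which is why encoding before quoting avoids the error.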

Encode the string to bytes first. UTF-8 is often used:

>>> import urllib
>>> urllib.quote(u'sch\xe9nefeld')
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py:1268: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  return ''.join(map(quoter, s))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 1268, in quote
    return ''.join(map(quoter, s))
KeyError: u'\xe9'
>>> urllib.quote(u'sch\xe9nefeld'.encode('utf8'))
'sch%C3%A9nefeld'

However, the encoding depends on what the server will accept. It's best to stick to the encoding the original form was sent with.
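
One way to apply this without patching urllib on every machine is to do the encoding in the calling code. The following is a minimal sketch: `quote_text` is a hypothetical helper name, the UTF-8 default is an assumption about what the server accepts, and the `urllib.parse` fallback is only there so the snippet also runs on Python 3.

```python
try:
    from urllib import quote          # Python 2, as used in the question
except ImportError:
    from urllib.parse import quote    # Python 3 fallback (assumption)


def quote_text(value, encoding='utf-8', safe='/'):
    """Hypothetical helper: encode text to bytes, then percent-encode it.

    Byte strings are passed through unchanged, so callers that already
    hold encoded data are not encoded twice.
    """
    if not isinstance(value, bytes):
        value = value.encode(encoding)
    return quote(value, safe)
```

With the default encoding, quote_text(u'sch\xe9nefeld') gives 'sch%C3%A9nefeld', the same result as the session above. Passing encoding='latin-1' gives 'sch%E9nefeld' instead, which shows why the choice has to match what the server expects.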

