This article collects typical usage examples of the Python function thirdparty.chardet.detect. If you are wondering how detect is used in practice, what a call to it looks like, or where to find working examples, the selected samples below may help.
Six code examples of detect are shown below, sorted by popularity by default.
Example 1: removeDynamicContent
# Assumes `re` and the bundled `chardet` module are imported at module level.
def removeDynamicContent(self, page, dynamicMarks):
    """
    Removing dynamic content from supplied page basing removal on
    precalculated dynamic markings
    """
    if page and len(dynamicMarks) > 0:
        # detect() can report None as the encoding; fall back to UTF-8
        # so that decode() does not raise a TypeError.
        encoding = chardet.detect(page)['encoding'] or 'utf-8'
        page = page.decode(encoding, errors='replace')
        for item in dynamicMarks:
            prefix, suffix = item
            if prefix is not None:
                prefix = prefix.decode(encoding, errors='replace')
            if suffix is not None:
                suffix = suffix.decode(encoding, errors='replace')
            if prefix is None and suffix is None:
                continue
            elif prefix is None:
                # Drop everything up to the suffix, keeping the suffix itself
                page = re.sub(r'(?s)^.+{0}'.format(re.escape(suffix)), suffix.replace('\\', r'\\'), page)
            elif suffix is None:
                # Drop everything after the prefix, keeping the prefix itself
                page = re.sub(r'(?s){0}.+$'.format(re.escape(prefix)), prefix.replace('\\', r'\\'), page)
            else:
                # Drop the dynamic part between prefix and suffix
                page = re.sub(r'(?s){0}.+{1}'.format(re.escape(prefix), re.escape(suffix)), "{0}{1}".format(prefix.replace('\\', r'\\'), suffix.replace('\\', r'\\')), page)
        # Convert back to bytes once all marks have been processed
        page = page.encode()
    return page
Developer ID: a13409440944, Project: dirsearch, Lines of code: 27, Source: DynamicContentParser.py
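Example 1 decodes the page with whatever encoding detect reports. The sketch below shows one way to guard that step, assuming only that the detector returns a dict with an 'encoding' key whose value may be None. The names decode_with_fallback, detector and fallback are hypothetical and do not come from dirsearch.

```python
def decode_with_fallback(data, detector, fallback="utf-8"):
    """Decode bytes with a detected encoding, tolerating a failed detection.

    `detector` is any callable shaped like chardet.detect: it takes bytes
    and returns a dict with an 'encoding' key whose value may be None.
    """
    # A None (or empty) encoding means detection failed; use the fallback.
    guess = detector(data).get("encoding") or fallback
    try:
        return data.decode(guess, errors="replace")
    except LookupError:
        # The detector named a codec this Python build does not know.
        return data.decode(fallback, errors="replace")
```

With the real library installed, the call would look like decode_with_fallback(page, chardet.detect). Passing the detector in as an argument keeps the helper testable without the library.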
Example 2: getHeuristicCharEncoding
def getHeuristicCharEncoding(page):
    """
    Returns page encoding charset detected by usage of heuristics
    Reference: http://chardet.feedparser.org/docs/
    """
    retVal = detect(page)["encoding"]
    infoMsg = "heuristics detected web page charset '%s'" % retVal
    singleTimeLogMessage(infoMsg, logging.INFO, retVal)
    return retVal
Developer ID: yowie, Project: sqlmap, Lines of code: 11, Source: basic.py
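Example 2 uses the reported encoding unconditionally. The result dict also carries a 'confidence' value (Example 3 below sets one by hand), so a caller can ignore weak guesses. This is a minimal sketch of that idea. The function name pick_encoding, the 0.5 threshold and the UTF-8 default are assumptions for illustration, not values recommended by chardet or used by sqlmap.

```python
def pick_encoding(result, min_confidence=0.5, default="utf-8"):
    """Return result['encoding'] only when the detector is confident enough.

    `result` is a dict shaped like the return value of chardet.detect,
    with 'encoding' (str or None) and 'confidence' (float or None) keys.
    """
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0
    if encoding and confidence >= min_confidence:
        return encoding
    # Detection failed or was too uncertain: use the caller's default.
    return default
```

A typical call would be pick_encoding(detect(page)). The right threshold depends on how costly a wrong guess is for the application.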
Example 3: _detectEncodeType
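def _detectEncodeType(self, content):
    result = {}
    # items() works on both Python 2 and 3 (iteritems() is Python 2 only)
    for key, value in self._bomList.items():
        if content.startswith(value):
            result['encoding'] = key + "-bom"
            result['confidence'] = 0.80
            break
    else:
        # for/else: runs only when no BOM matched (the loop never hit break)
        result = chardet.detect(content)
    return result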
Developer ID: Catcherman, Project: pentestdb, Lines of code: 12, Source: coder.py
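Example 3 checks for a byte order mark before falling back to chardet, but the contents of self._bomList are not shown. The sketch below is one possible BOM table built from the constants in the standard codecs module. The names _BOMS and sniff_bom are hypothetical and not part of pentestdb. Order matters here: the UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark, so the longer marks are tested first.

```python
import codecs

# Longest BOMs first: the UTF-32 LE mark (FF FE 00 00) starts with the
# UTF-16 LE mark (FF FE), so checking UTF-16 first would misreport it.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def sniff_bom(content):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if content.startswith(bom):
            return name
    return None
```

When sniff_bom returns None, the caller can hand the bytes to a statistical detector, as the for/else branch in Example 3 does.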
Example 4: getHeuristicCharEncoding
def getHeuristicCharEncoding(page):
    """
    Returns page encoding charset detected by usage of heuristics
    Reference: http://chardet.feedparser.org/docs/
    """
    key = hash(page)
    retVal = kb.cache.encoding.get(key) or detect(page)["encoding"]
    kb.cache.encoding[key] = retVal
    if retVal:
        infoMsg = "heuristics detected web page charset '%s'" % retVal
        singleTimeLogMessage(infoMsg, logging.INFO, retVal)
    return retVal
Developer ID: sdlirjc, Project: algorithm, Lines of code: 15, Source: basic.py
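Example 4 caches the detected encoding, but the `get(key) or detect(...)` pattern repeats detection whenever the cached value is None, because None is falsy. A sentinel object lets the cache tell "not cached yet" apart from "cached, detection failed". The sketch below uses a plain dict in place of kb.cache.encoding, and the names cached_encoding and _MISSING are hypothetical.

```python
_MISSING = object()  # sentinel: distinguishes "no entry" from a cached None


def cached_encoding(page, cache, detector):
    """Return the detected encoding for `page`, running `detector` at most once.

    `cache` is a dict keyed by hash(page); `detector` is any callable shaped
    like chardet.detect (bytes in, dict with an 'encoding' key out).
    """
    key = hash(page)
    encoding = cache.get(key, _MISSING)
    if encoding is _MISSING:
        encoding = detector(page)["encoding"]
        cache[key] = encoding  # a None result is stored too
    return encoding
```

As in the original, keying on hash(page) means two different pages with the same hash would share an entry. Using the bytes themselves as the key avoids that, at the cost of memory.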
Example 5: detect
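def detect(self, size=2048):
    '''
    Infer the encoding of a file.
    '''
    # Read only the first `size` bytes; the context manager closes the file
    with open(self.fileName, "rb") as f:
        content = f.read(size)
    result = dict()
    # items() works on both Python 2 and 3 (iteritems() is Python 2 only)
    for key, value in self._bomList.items():
        if content.startswith(value):
            result['encoding'] = key + "-bom"
            result['confidence'] = 0.80
            break
    else:
        # for/else: runs only when no BOM matched (the loop never hit break)
        result = chardet.detect(content)
    return result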
Developer ID: mrphishxxx, Project: pentestdb, Lines of code: 15, Source: coder.py
Example 6: detect
def detect(self):
    '''
    Infer the encoding of a non-ASCII string.
    '''
    # Concatenate the second element of each pre-decoded item, then detect
    rawstr = "".join([x[1] for x in self._autoPreDecode()])
    return chardet.detect(rawstr)
Developer ID: Catcherman, Project: pentestdb, Lines of code: 6, Source: coder.py
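Note: The thirdparty.chardet.detect examples in this article were compiled by 纯净天空 from source-code and documentation platforms such as Github and MSDocs. The snippets were selected from open-source projects, and copyright in the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.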