Stepets/utf8.lua: pure-lua 5.3 regex library


Open-source project name:

Stepets/utf8.lua

Open-source project URL:

https://github.com/Stepets/utf8.lua

Open-source programming language:

Lua 99.6%

Open-source project introduction:

utf8.lua

pure-lua 5.3 regex library for Lua 5.3, Lua 5.1, LuaJIT

This library provides a simple way to add UTF-8 support to your application.

Example:

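local utf8 = require('.utf8'):init()
-- Copy the UTF-8 aware functions into the string table so that
-- method calls such as str:find() use them.
for k,v in pairs(utf8) do
  string[k] = v
end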

local str = "пыщпыщ ололоо я водитель нло"
print(str:find("(.л.+)н"))
-- 8	26	ололоо я водитель

print(str:gsub("ло+", "보라"))
-- пыщпыщ о보라보라 я водитель н보라	3

print(str:match("^п[лопыщ ]*я"))
-- пыщпыщ ололоо я

Usage:

This library can be used as a drop-in replacement for the vanilla string library. It also exports all vanilla functions under the raw sub-object.

local utf8 = require('.utf8'):init()
local str = "пыщпыщ ололоо я водитель нло"
utf8.gsub(str, "ло+", "보라")
-- пыщпыщ о보라보라 я водитель н보라	3
utf8.raw.gsub(str, "ло+", "보라")
-- пыщпыщ о보라보라о я водитель н보라	3
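The leftover "о" in the raw result comes from byte-oriented matching: in vanilla patterns, "+" applies to the last byte of "о" rather than to the whole character. The sketch below reproduces this with plain Lua only (it does not load utf8.lua) and assumes the source file is saved as UTF-8.

```lua
-- Plain Lua only; utf8.lua is not required here.
local s = "лоо"   -- three characters, six bytes in UTF-8
print(#s)         -- 6: '#' counts bytes, not characters

-- "о" is the two bytes D0 BE, so in a byte-oriented pattern "+" repeats
-- only the trailing BE byte. The match stops after the first "о".
local replaced, count = string.gsub(s, "ло+", "X")
print(replaced, count)   -- Xо	1
```

A character-aware gsub would consume both "о" characters, which is the difference between the utf8.gsub and utf8.raw.gsub outputs above.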

It also provides all functions from the Lua 5.3 UTF-8 module except utf8.len (s [, i [, j]]). If you need to validate your strings, use utf8.validate(str, byte_pos) or iterate over them with utf8.validator.

Please note that the library assumes regexes are valid UTF-8 strings. If you need to manipulate individual bytes, use the vanilla functions under utf8.raw.

Installation:

Download the repository into your project folder (no rockspecs yet).

The examples assume the library is placed in a subfolder named utf8, not utf8.lua (the folder name a plain clone of the repository produces).

As of Lua 5.3, the built-in utf8 module takes precedence over a user-provided one. In that case you can specify the full module path (.utf8).
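The precedence can be checked without the library installed. This is a minimal sketch that relies only on standard Lua behavior: on Lua 5.3 and later the interpreter registers its own utf8 module in package.loaded at startup, so require('utf8') returns that table without searching package.path. On Lua 5.1 and LuaJIT no such entry exists.

```lua
-- Plain Lua only; utf8.lua is not required here.
local builtin = package.loaded["utf8"]

if builtin then
  -- Lua 5.3+: require() finds the preloaded entry and never looks on disk.
  print("built-in utf8 module present:", require("utf8") == builtin)
else
  -- Lua 5.1 / LuaJIT: the name "utf8" is free for a user-provided module.
  print("no built-in utf8 module")
end
```

Because '.utf8' is a different module name, require('.utf8') does not hit the preloaded entry and should fall through to the normal file search, which is why the examples here use it.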

Configuration:

The library is highly modular. You can provide your own implementation for almost any function it uses, and it already ships with several back-ends.

Probably the most interesting customizations are utf8.config.loadstring and utf8.config.cache, in case you want to precompile your regexes.

local utf8 = require('.utf8')
-- my_smart_cache is a placeholder for your own cache implementation
utf8.config = {
  cache = my_smart_cache,
}
utf8:init()

For the lower and upper functions to work in environments where ffi cannot be used, you can specify substitution tables (the upstream README links to a data example):

local utf8 = require('.utf8')
utf8.config = {
  conversion = {
    uc_lc = utf8_uc_lc,
    lc_uc = utf8_lc_uc
  },
}
utf8:init()
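As a rough illustration of what such tables might contain, the sketch below builds a tiny pair of mappings. It assumes each table maps one UTF-8 character string to its counterpart, which is an assumption about the expected format rather than something stated above. The three entries are hypothetical sample data, and real tables would need every cased character your application uses.

```lua
-- Hypothetical sample data: uppercase -> lowercase for three Cyrillic letters.
-- Assumes keys and values are single UTF-8 characters stored as strings.
local utf8_uc_lc = {
  ["А"] = "а",
  ["Б"] = "б",
  ["В"] = "в",
}

-- Derive the reverse (lowercase -> uppercase) table by inverting the first.
-- Note: inversion is lossy if two uppercase characters share one lowercase form.
local utf8_lc_uc = {}
for upper, lower in pairs(utf8_uc_lc) do
  utf8_lc_uc[lower] = upper
end

print(utf8_uc_lc["Б"], utf8_lc_uc["б"])   -- б	Б
```

Tables built this way would then be passed as conversion.uc_lc and conversion.lc_uc in the configuration shown above.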

Customize the configuration before initialization. You can also change it after init, and this may work for everything except modules, all of which would need to be reloaded.

Documentation:

See the repository at https://github.com/Stepets/utf8.lua for the documentation links.

Issue reporting:

Please provide an example script that causes the error, together with a description of your environment and the debug output. Debug output can be obtained like this:

local utf8 = require('.utf8')
utf8.config = {
  debug = utf8:require("util").debug
}
utf8:init()
-- your code

The default logger is io.write. It can be changed by specifying logger = my_logger in the configuration.
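A replacement logger might look like the sketch below, which collects debug output in a table so it can be attached to an issue report. It assumes the logger is called with the same kind of arguments as io.write (any number of strings or numbers). The names my_logger and lines are hypothetical.

```lua
-- Hypothetical logger that buffers output instead of writing to stdout.
-- Assumption: it receives io.write-style arguments (strings and numbers).
local lines = {}

local function my_logger(...)
  local parts = {}
  for i = 1, select("#", ...) do
    parts[i] = tostring((select(i, ...)))
  end
  lines[#lines + 1] = table.concat(parts)
end

-- Stand-alone check of the logger itself:
my_logger("compiled regex ", 42, "\n")
io.write(lines[1])   -- compiled regex 42

-- Wiring it in would follow the configuration shown above:
-- utf8.config = { debug = utf8:require("util").debug, logger = my_logger }
```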



