Open-source software name: franko/luajit-lang-toolkit
Open-source software URL: https://github.com/franko/luajit-lang-toolkit
Open-source programming language: Lua 85.7%
Open-source software introduction:

LuaJIT Language Toolkit

The LuaJIT Language Toolkit is an implementation of the Lua programming language written in Lua itself. It works by generating LuaJIT bytecode, including debug information, and uses LuaJIT's virtual machine to run the generated bytecode.

On its own, the language toolkit does not do anything useful, since LuaJIT itself does the same things natively. The purpose of the language toolkit is to provide a starting point to implement a programming language that targets the LuaJIT virtual machine.

With the LuaJIT Language Toolkit, it is easy to create a new language or modify the Lua language because the parser is cleanly separated from the bytecode generator and the virtual machine.

The toolkit implements a complete pipeline to parse a Lua program, generate an AST, and generate the corresponding bytecode.

Lexer

The lexer's role is to recognize lexical elements from the program text. It takes the text of the program as input and produces a stream of "tokens" as its output.

Using the language toolkit you can run the lexer on its own to examine the stream of tokens.
For example, the lexer can take the following code fragment:

    local x = {}
    for k = 1, 10 do
        x[k] = k*k + 1
    end

and generate the corresponding list of tokens.
Each line of the lexer's output represents a token, where the first element is the kind of token and the second element is its value, if any.

The lexer's code is an almost literal translation of LuaJIT's lexer.

Parser

The parser takes the token stream from the lexer and builds statements and expressions according to the language's grammar. The parser is based on a list of parsing rules that are invoked each time the input matches a given rule. When the input matches a rule, a corresponding function in the AST (abstract syntax tree) module is called to build an AST node. The generated nodes are in turn passed as arguments to the other parsing rules until the whole program is parsed and a complete AST is built for the program text.

The AST is very useful as an abstraction of the structure of the program, and is easier to manipulate than the program text.

What distinguishes the language toolkit from LuaJIT is that the parser phase generates an AST, and the bytecode generation is done in a separate phase only when the AST is complete.

LuaJIT itself operates differently. During the parsing phase it does not generate any AST; instead, the bytecode is generated directly and loaded into memory to be executed by the VM. This means that LuaJIT's C implementation performs three operations, namely parsing the program text, generating the bytecode, and loading the bytecode into memory, in one single pass. This approach is remarkable and very efficient, but it makes it difficult to modify or extend the programming language.

Parsing Rule example

To illustrate how parsing works in the language toolkit, consider the grammar rule for the "return" statement, which consists of the "return" keyword followed by an optional list of expressions.
In this case the toolkit parser's rule will parse the optional expression list by calling the function expr_list:

    local function parse_return(ast, ls, line)
        ls:next() -- Skip 'return'.
        ls.fs.has_return = true
        local exps
        if EndOfBlock[ls.token] or ls.token == ';' then -- Base return.
            exps = { }
        else -- Return with one or more values.
            exps = expr_list(ast, ls)
        end
        return ast:return_stmt(exps, line)
    end

As you can see, the AST functions are invoked as methods of the ast object, as in ast:return_stmt(exps, line).

In addition, the parser provides information about the function currently being parsed and about the syntactic scope.
The first is used to keep track of some information about the current function being parsed. The syntactic scope rules tell the user's rule when a new syntactic block begins or ends. Currently this is not really used by the AST builder but it can be useful for other implementations.

The Abstract Syntax Tree (AST)

The abstract syntax tree represents the whole Lua program, with all the information the parser has gathered about it.

One possible approach to implementing a new programming language is to generate an AST that more closely corresponds to the target programming language, and then transform the tree into a Lua AST in a separate phase. Another possible approach is to generate the appropriate Lua AST nodes directly from the parser itself.

Currently the language toolkit does not perform any additional transformations, and just passes the AST to the bytecode generator module.

Bytecode Generator

Once the AST is generated, it can be fed to the bytecode generator module, which will generate the corresponding LuaJIT bytecode.

The bytecode generator is based on the original work of Richard Hundt for the Nyanga programming language. I modified it extensively to produce optimized code similar to what LuaJIT itself would generate. A lot of work was also done to ensure the correctness of the bytecode and of the debug information.

Alternative Lua Code generator

Instead of passing the AST to the bytecode generator, an alternative module can be used to generate Lua code. The module is called "luacode-generator" and can be used exactly like the bytecode generator.

The Lua code generator has the advantage of being simpler and safer, because the generated code is parsed directly by LuaJIT, which ensures complete compatibility of the bytecode from the beginning.

Currently the Lua code generator backend does not preserve the line numbers of the original source code. This is meant to be fixed in the future.
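The idea of turning an AST back into Lua source can be illustrated with a minimal sketch. It is not the toolkit's "luacode-generator" module: the node kinds and field names (Chunk, Local, Return, Binary, Number, Identifier) are hypothetical and chosen only for illustration, and the toolkit's real AST format may differ. The sketch walks a small hand-built tree, emits Lua text, and hands that text to the host Lua implementation for compilation.

```lua
-- Toy AST-to-Lua-source generator. The node kinds and field names used here
-- are hypothetical; they are not the toolkit's actual AST format.
local emit = {}

-- Dispatch on the node kind and return the Lua source text for the node.
local function gen(node)
    local handler = emit[node.kind]
    if not handler then
        error("unknown node kind: " .. tostring(node.kind))
    end
    return handler(node)
end

function emit.Number(node) return tostring(node.value) end

function emit.Identifier(node) return node.name end

function emit.Binary(node)
    -- Parenthesize every binary expression so operator precedence never matters.
    return "(" .. gen(node.left) .. " " .. node.op .. " " .. gen(node.right) .. ")"
end

function emit.Local(node)
    return "local " .. node.name .. " = " .. gen(node.value)
end

function emit.Return(node)
    local parts = {}
    for i, e in ipairs(node.values) do parts[i] = gen(e) end
    if #parts == 0 then return "return" end
    return "return " .. table.concat(parts, ", ")
end

function emit.Chunk(node)
    local lines = {}
    for i, stmt in ipairs(node.body) do lines[i] = gen(stmt) end
    return table.concat(lines, "\n")
end

-- Hand-built AST for: local x = 3 * 3 + 1; return x
local tree = {
    kind = "Chunk",
    body = {
        { kind = "Local", name = "x", value = {
            kind = "Binary", op = "+",
            left = { kind = "Binary", op = "*",
                     left = { kind = "Number", value = 3 },
                     right = { kind = "Number", value = 3 } },
            right = { kind = "Number", value = 1 } } },
        { kind = "Return", values = { { kind = "Identifier", name = "x" } } },
    },
}

local source = gen(tree)
print(source)

-- Let the host Lua implementation parse and compile the generated text.
local load_source = loadstring or load -- loadstring in Lua 5.1/LuaJIT, load in 5.2+
local chunk = assert(load_source(source))
print(chunk()) --> 10
```

Because the generated text is compiled by the host implementation, a mistake in the generator shows up as an ordinary syntax error and cannot produce malformed bytecode. The same traversal also works as a pretty-printer for the tree.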
Use this backend instead of the bytecode generator if you prefer a safer backend to convert the Lua AST to code. The module can also be used for pretty-printing a Lua AST, since the code itself is probably the most human-readable representation of the AST.

C API

The language toolkit provides a very simple set of C APIs to implement a custom language. The functions provided by the C API are:

    /* The functions above are the equivalent of the luaL_* corresponding
       functions. */
    extern int language_init(lua_State *L);
    extern int language_report(lua_State *L, int status);
    extern int language_loadbuffer(lua_State *L, const char *buff, size_t sz, const char *name);
    extern int language_loadfile(lua_State *L, const char *filename);
    /* This function pushes on the stack a Lua table with the functions:
       loadstring, loadfile, dofile and loader.
       The first three functions can replace the Lua functions while the
       last one, loader, can be used as a customized "loader" function for
       the "require" function. */
    extern int luaopen_langloaders(lua_State *L);
    /* OPTIONAL:
       Load into package.preload lang.* modules using embedded bytecode. */
    extern void language_bc_preload(lua_State *L);

The functions above can be used to create a custom LuaJIT executable that uses the language toolkit implementation.

How to build

The LuaJIT Language Toolkit can be compiled and optionally installed using Meson. Ensure that Meson is installed; the easiest way is to use pip, the Python package installer. Ensure also that LuaJIT is correctly installed, since it is required for the language toolkit.

Once Meson and LuaJIT are installed, configure the build with the command:

    meson setup build

so that the 'build' directory will be used to build. You may also pass the preload option:

    meson setup -Dpreload=true build

Then, to build, use 'ninja', Meson's default backend:

    # build
    ninja -C build
    # install
    ninja -C build install

The Meson-based build will take care of installing all the required Lua files, the library itself, the luajit-x executable and a pkg-config file. Please note that when using the 'preload' option the Lua files will not be installed, since they are embedded in the library itself.

Running the Application

The application can be run with standard luajit by means of the "run.lua" script.
The "run.lua" script simply invokes the complete pipeline of the lexer, parser and bytecode generator, and passes the bytecode to luajit with "loadstring".

The language toolkit also provides a customized executable named luajit-x. In the standard build you can experiment with the language by modifying the Lua implementation of the language and test the changes immediately. If you work with the Lua files of the language toolkit you may choose to disable the 'preload' option.

Generated Bytecode

You can inspect the bytecode generated by the language toolkit by using the "-b" options. They can be invoked either with standard luajit by using "run.lua" or directly using the customized program luajit-x. For example, the bytecode of a Lua file can be inspected either way, and it is printed on the screen. You can compare it with the bytecode generated natively by LuaJIT.
In the example above the generated bytecode will be identical to that generated by LuaJIT. This is not an accident, since the Language Toolkit's bytecode generator is designed to produce the same bytecode that LuaJIT itself would generate. In some cases the generated code will differ, but this is not considered a big problem as long as the generated code is still semantically correct.

Bytecode Annotated Dump

In addition to the standard LuaJIT bytecode functions, the language toolkit also supports a special debug mode where the bytecode is printed byte-by-byte in hex format with some annotations on the right side of the screen. The annotations explain the meaning of each chunk of bytes and decode them as appropriate.
This kind of output is especially useful for debugging the language toolkit itself because it accounts for every byte of the bytecode and includes all the sections of the bytecode.
There is a small trick to compare with the bytecode generated by LuaJIT, because the latter does not support this annotated dump: the bytecode is first generated with LuaJIT, and then you can use the language toolkit to dump it, so that you can compare the two outputs.

Current Status

Currently the LuaJIT Language Toolkit should be considered beta software.

The implementation is now complete in terms of features and well tested, even for the most complex cases, and a complete test suite is used to verify the correctness of the generated bytecode.

The language toolkit is currently capable of executing itself. This means that the language toolkit is able to correctly compile and load all of its modules and execute them correctly.

Yet some bugs are probably present, and you should be cautious when you use the LuaJIT Language Toolkit.