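Preface
We often run into this situation: we find a library we want to use, but it is written in TypeScript and we want to call it from C#. That means translating the original TypeScript type declarations into C# code. If the library is a UI component, we also have to host it in a WebView and use JavaScript/C# interop to call the component's methods, handle its events, and so on.
This is tedious manual labor, especially the type translation step.
I needed exactly this while helping maintain an open-source UWP project, monaco-editor-uwp, which wraps Microsoft's monaco editor as a UWP control.
Its monaco.d.ts is a full 1.5 MB, and the API changes frequently. Translating it by hand is a huge amount of work and makes it easy to miss new changes; an automatic generator removes most of that manual effort.
There is a project on GitHub called QuickType, but its TypeScript support is very limited: it is still stuck on TypeScript 3.2, and it fails with an error whenever it meets a type it does not recognize, such as DOM types.
So I decided to write my own code generator, TypedocConverter: https://github.com/hez2010/TypedocConverter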
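Design
I originally planned to start from lexical and semantic analysis of TypeScript, but it turns out that a project called Typedoc already does this and can emit its result as JSON. The rest is simple: convert the TypeScript AST into a C# AST, then turn that AST back into code.
Without further ado, let's start writing.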
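Building the TypeScript AST type bindings
F#'s stronger type system makes declaring and using types very simple, with solid support for recursive patterns, pattern matching, option types, and more. That is why this project uses F# rather than C#: C# supports much of this and has some FP capability, but it still leans toward OOP and requires a lot of tedious boilerplate.
We put the TypeScript type bindings in Definition.fs. This step is a direct translation of Typedoc's definitions into F#:
First comes the ReflectionKind enum, which identifies the kind of each node in Typedoc's JSON output: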
type ReflectionKind =
    | Global = 0
    | ExternalModule = 1
    | Module = 2
    | Enum = 4
    | EnumMember = 16
    | Variable = 32
    | Function = 64
    | Class = 128
    | Interface = 256
    | Constructor = 512
    | Property = 1024
    | Method = 2048
    | CallSignature = 4096
    | IndexSignature = 8192
    | ConstructorSignature = 16384
    | Parameter = 32768
    | TypeLiteral = 65536
    | TypeParameter = 131072
    | Accessor = 262144
    | GetSignature = 524288
    | SetSignature = 1048576
    | ObjectLiteral = 2097152
    | TypeAlias = 4194304
    | Event = 8388608
    | Reference = 16777216
Next are the modifier flags, ReflectionFlags. Note that every member of this record is an option:
type ReflectionFlags = {
    IsPrivate: bool option
    IsProtected: bool option
    IsPublic: bool option
    IsStatic: bool option
    IsExported: bool option
    IsExternal: bool option
    IsOptional: bool option
    IsReset: bool option
    HasExportAssignment: bool option
    IsConstructorProperty: bool option
    IsAbstract: bool option
    IsConst: bool option
    IsLet: bool option
}
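Then comes Reflection itself. Because every kind of Reflection can be told apart by its ReflectionKind, I chose to merge all kinds of Reflection into a single record rather than use a union type. The latter looks cleaner, but it would require a lot of pattern-matching code when actually walking the AST.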
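To make the trade-off concrete, here is a minimal sketch of how a single record tagged with a kind can be queried. The `Node` record and the `collect` function are hypothetical, cut-down stand-ins invented for this illustration; they are not part of TypedocConverter.

```fsharp
// Cut-down stand-ins for the real definitions (assumption: only these fields matter here).
type ReflectionKind =
    | Enum = 4
    | Class = 128
    | Interface = 256

type Node =
    { Name: string
      Kind: ReflectionKind
      Children: Node list option }

// With one flat record type, a single comparison on Kind selects the nodes we want;
// no per-case destructuring is needed.
let rec collect (kind: ReflectionKind) (node: Node) : Node list =
    let nested =
        match node.Children with
        | Some children -> children |> List.collect (collect kind)
        | None -> []
    if node.Kind = kind then node :: nested else nested

let size = { Name = "Size"; Kind = ReflectionKind.Enum; Children = None }
let editor = { Name = "IEditor"; Kind = ReflectionKind.Interface; Children = None }
let root = { Name = "root"; Kind = ReflectionKind.Class; Children = Some [ size; editor ] }

let enumNames = collect ReflectionKind.Enum root |> List.map (fun n -> n.Name)
printfn "%A" enumNames
```

The cost of this design is that every field that only some kinds use has to be optional, which is why most members of the record are option values.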
Because some of the records refer to one another, we use the and keyword to define them as mutually recursive records.
type Reflection = {
    Id: int
    Name: string
    OriginalName: string
    Kind: ReflectionKind
    KindString: string option
    Flags: ReflectionFlags
    Parent: Reflection option
    Comment: Comment option
    Sources: SourceReference list option
    Decorators: Decorator option
    Decorates: Type list option
    Url: string option
    Anchor: string option
    HasOwnDocument: bool option
    CssClasses: string option
    DefaultValue: string option
    Type: Type option
    TypeParameter: Reflection list option
    Signatures: Reflection list option
    IndexSignature: Reflection list option
    GetSignature: Reflection list option
    SetSignature: Reflection list option
    Overwrites: Type option
    InheritedFrom: Type option
    ImplementationOf: Type option
    ExtendedTypes: Type list option
    ExtendedBy: Type list option
    ImplementedTypes: Type list option
    ImplementedBy: Type list option
    TypeHierarchy: DeclarationHierarchy option
    Children: Reflection list option
    Groups: ReflectionGroup list option
    Categories: ReflectionCategory list option
    Reflections: Map<int, Reflection> option
    Directory: SourceDirectory option
    Files: SourceFile list option
    Readme: string option
    PackageInfo: obj option
    Parameters: Reflection list option
}
and DeclarationHierarchy = {
    Type: Type list
    Next: DeclarationHierarchy option
    IsTarget: bool option
}
and Type = {
    Type: string
    Id: int option
    Name: string option
    ElementType: Type option
    Value: string option
    Types: Type list option
    TypeArguments: Type list option
    Constraint: Type option
    Declaration: Reflection option
}
and Decorator = {
    Name: string
    Type: Type option
    Arguments: obj option
}
and ReflectionGroup = {
    Title: string
    Kind: ReflectionKind
    Children: int list
    CssClasses: string option
    AllChildrenHaveOwnDocument: bool option
    AllChildrenAreInherited: bool option
    AllChildrenArePrivate: bool option
    AllChildrenAreProtectedOrPrivate: bool option
    AllChildrenAreExternal: bool option
    SomeChildrenAreExported: bool option
    Categories: ReflectionCategory list option
}
and ReflectionCategory = {
    Title: string
    Children: int list
    AllChildrenHaveOwnDocument: bool option
}
and SourceDirectory = {
    Parent: SourceDirectory option
    Directories: Map<string, SourceDirectory>
    Groups: ReflectionGroup list option
    Files: SourceFile list
    Name: string option
    DirName: string option
    Url: string option
}
and SourceFile = {
    FullFileName: string
    FileName: string
    Name: string
    Url: string option
    Parent: SourceDirectory option
    Reflections: Reflection list option
    Groups: ReflectionGroup list option
}
and SourceReference = {
    File: SourceFile option
    FileName: string
    Line: int
    Character: int
    Url: string option
}
and Comment = {
    ShortText: string
    Text: string option
    Returns: string option
    Tags: CommentTag list option
}
and CommentTag = {
    TagName: string
    ParentName: string
    Text: string
}
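With that, the type bindings are translated. What remains is to deserialize the JSON generated by Typedoc into these types.
Deserialization
Everything seemed to be going smoothly, but neither System.Text.Json nor Newtonsoft.Json supports F# option types out of the box, so we still need a JsonConverter to handle them.
This project uses Newtonsoft.Json, because System.Text.Json was not yet mature at the time of writing. Thanks to F#'s compatibility with OOP, implementing an OptionConverter is easy.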
open System
open Microsoft.FSharp.Reflection
open Newtonsoft.Json

// Note: Type below is System.Type, not the Type record from Definition.fs.
type OptionConverter() =
    inherit JsonConverter()

    // Handle only 'T option.
    override __.CanConvert(objectType: Type) : bool =
        match objectType.IsGenericType with
        | false -> false
        | true -> typedefof<_ option> = objectType.GetGenericTypeDefinition()

    // None is written as null, Some x is written as x.
    override __.WriteJson(writer: JsonWriter, value: obj, serializer: JsonSerializer) : unit =
        serializer.Serialize(writer,
            if isNull value then null
            else
                let _, fields = FSharpValue.GetUnionFields(value, value.GetType())
                fields.[0]
        )

    // Read the inner value (as Nullable<T> for value types), then wrap it in None or Some.
    override __.ReadJson(reader: JsonReader, objectType: Type, _existingValue: obj, serializer: JsonSerializer) : obj =
        let innerType = objectType.GetGenericArguments().[0]
        let value =
            serializer.Deserialize(
                reader,
                if innerType.IsValueType
                then (typedefof<_ Nullable>).MakeGenericType([|innerType|])
                else innerType
            )
        let cases = FSharpType.GetUnionCases objectType
        if isNull value then FSharpValue.MakeUnion(cases.[0], [||])
        else FSharpValue.MakeUnion(cases.[1], [|value|])
That completes the deserialization work.
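The reflection step at the heart of the converter can be tried in isolation, without any JSON library. The sketch below uses only FSharp.Core; `isOption` and `makeOption` are hypothetical names chosen for this illustration, and the code assumes (as the converter does) that the union cases of an option type are ordered None first, Some second.

```fsharp
open System
open Microsoft.FSharp.Reflection

// True when t is some closed 'T option type.
let isOption (t: Type) : bool =
    t.IsGenericType && t.GetGenericTypeDefinition() = typedefof<option<_>>

// Build an option value for a type that is only known at runtime.
let makeOption (optionType: Type) (value: obj) : obj =
    let cases = FSharpType.GetUnionCases optionType
    // cases.[0] is None and cases.[1] is Some, following the union tag order.
    if isNull value then FSharpValue.MakeUnion(cases.[0], [||])
    else FSharpValue.MakeUnion(cases.[1], [| value |])

let some = makeOption typeof<int option> (box 42) :?> int option
let none = makeOption typeof<string option> null :?> string option
printfn "%A %A" some none
```

In the real converter the `value` argument comes from `serializer.Deserialize`, which is why value types are read as `Nullable<T>`: a missing or null JSON value then arrives as null and becomes None.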
We can download monaco.d.ts from the monaco-editor repository to test the deserializer, and find that Typedoc's JSON output is deserialized correctly.
[Figure: deserialization result]
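Building the C# AST types
Of course, this "AST" is not a real AST. There is no need to go down to statement level, because we are only writing a simple code generator; modeling the entity structure is enough.
We define the entity structure in Entity.fs. We only need to support interface, class, and enum, and for class and interface we only need method, property, and event.
The code may also contain generics, so we need to account for that as well.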
type EntityBodyType = {
    Type: string
    Name: string option
    InnerTypes: EntityBodyType list
}

type EntityMethod = {
    Comment: string
    Modifier: string list
    Type: EntityBodyType
    Name: string
    TypeParameter: string list
    Parameter: EntityBodyType list
}

type EntityProperty = {
    Comment: string
    Modifier: string list
    Name: string
    Type: EntityBodyType
    WithGet: bool
    WithSet: bool
    IsOptional: bool
    InitialValue: string option
}

type EntityEvent = {
    Comment: string
    Modifier: string list
    DelegateType: EntityBodyType
    Name: string
    IsOptional: bool
}

type EntityEnum = {
    Comment: string
    Name: string
    Value: int64 option
}

type EntityType =
    | Interface
    | Class
    | Enum
    | StringEnum

type Entity = {
    Namespace: string
    Name: string
    Comment: string
    Methods: EntityMethod list
    Properties: EntityProperty list
    Events: EntityEvent list
    Enums: EntityEnum list
    InheritedFrom: EntityBodyType list
    Type: EntityType
    TypeParameter: string list
    Modifier: string list
}
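Documentation comment generator
Documentation comments are also indispensable: they make the generated bindings far easier to use, with no need to refer back to the comments in the original TypeScript declarations.
The code is simple: it only has to turn the text into XML.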
// Escape the characters that are significant in XML. "&" must be replaced first.
let escapeSymbols (text: string) =
    if isNull text then ""
    else
        text
            .Replace("&", "&amp;")
            .Replace("<", "&lt;")
            .Replace(">", "&gt;")

// Prefix every line with "/// ".
let toCommentText (text: string) =
    if isNull text then ""
    else
        text.Split "\n"
        |> Array.map (fun t -> "/// " + escapeSymbols t)
        |> Array.reduce (fun accu next -> accu + "\n" + next)

let getXmlDocComment (comment: Comment) =
    let prefix = "/// <summary>\n"
    let suffix = "\n/// </summary>"
    let summary =
        match comment.Text with
        | Some text ->
            // Separate the short text from the long text with a line break,
            // otherwise the two "/// " runs end up on the same line.
            prefix + toCommentText comment.ShortText + "\n" + toCommentText text + suffix
        | _ ->
            match comment.ShortText with
            | "" -> ""
            | _ -> prefix + toCommentText comment.ShortText + suffix
    let returns =
        match comment.Returns with
        | Some text -> "\n/// <returns>\n" + toCommentText text + "\n/// </returns>"
        | _ -> ""
    summary + returns
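Type generator
TypeScript's type system is quite flexible, with union types, intersection types, and so on. Even the current C# 8 cannot express these directly; native support would have to come from a later version of the language. We could generate a struct with implicit conversion operators to support union types, but that is not implemented yet, so for now we substitute the first type in the union. For intersection types we fall back to object for the time being.
Union types have one special case, though: string literal type aliases. Something like this: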
type Size = "XS" | "S" | "M" | "L" | "XL";
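That is, a type alias made up purely of string values. We do need to support this, because it is used very widely in TypeScript.
How can C# support it without a corresponding syntax? Simple: we create an enum that contains every element of the type, and then write a JsonConverter for it. This ensures that after serialization the TypeScript side recognizes the value correctly, while the C# side gets a type-safe coding experience.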
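As a rough illustration of the enum half of this idea, the sketch below renders C# enum source text for a list of string literals. Both `toIdentifier` and `renderStringEnum` are hypothetical helpers written for this example, not functions from TypedocConverter, and the accompanying JsonConverter that maps each member back to its original literal is left out.

```fsharp
open System

// Hypothetical helper: turn a literal such as "XS" or "extra-large" into a C# identifier.
// Assumption: literals only contain letters, digits, '-', '_' and spaces.
let toIdentifier (literal: string) : string =
    literal.Split([| '-'; '_'; ' ' |], StringSplitOptions.RemoveEmptyEntries)
    |> Array.map (fun part -> string (Char.ToUpperInvariant part.[0]) + part.Substring(1))
    |> String.concat ""

// Render the C# enum for a string-literal type alias. The trailing comment records the
// original literal, which is what a JsonConverter would have to read and write.
let renderStringEnum (name: string) (literals: string list) : string =
    let members =
        literals
        |> List.map (fun l -> sprintf "    %s, // serialized as \"%s\"" (toIdentifier l) l)
        |> String.concat "\n"
    sprintf "enum %s\n{\n%s\n}" name members

printfn "%s" (renderStringEnum "Size" [ "XS"; "S"; "M"; "L"; "XL" ])
```

Keeping the original literal next to each member matters because the identifier is not always reversible: "extra-large" and "extra_large" would both become ExtraLarge, so a real generator would also have to detect such collisions.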
We also need to provide some common type conversions:
- Array<T> -> T[]
- Set<T> -> System.Collections.Generic.ISet<T>
- Map<K, V> -> System.Collections.Generic.IDictionary<K, V>
- Promise<T> -> System.Threading.Tasks.Task<T>
- callbacks -> System.Func<T...>, System.Action<T...>
- Tuple types
- Other array types such as Uint32Array
- For <void>, we need to strip the generic argument, i.e. T<void> -> T
The implementation is as follows:
let rec getType (typeInfo: Type): EntityBodyType =
    let genericType =
        match typeInfo.Type with
        | "intrinsic" ->
            match typeInfo.Name with
            | Some name ->
                match name with
                | "number" -> { Type = "double"; InnerTypes = []; Name = None }
                | "boolean" -> { Type = "bool"; InnerTypes = []; Name = None }
                | "string" -> { Type = "string"; InnerTypes = []; Name = None }
                | "void" -> { Type = "void"; InnerTypes = []; Name = None }
                | _ -> { Type = "object"; InnerTypes = []; Name = None }
            | _ -> { Type = "object"; InnerTypes = []; Name = None }
        | "reference" | "typeParameter" ->
            match typeInfo.Name with
            | Some name ->
                match name with
                | "Promise" -> { Type = "System.Threading.Tasks.Task"; InnerTypes = []; Name = None }
                | "Set" -> { Type = "System.Collections.Generic.ISet"; InnerTypes = []; Name = None }
                | "Map" -> { Type = "System.Collections.Generic.IDictionary"; InnerTypes = []; Name = None }
                | "Array" -> { Type = "System.Array"; InnerTypes = []; Name = None }
                | "BigUint64Array" -> { Type = "System.Array"; InnerTypes = [{ Type = "ulong"; InnerTypes = []; Name = None }]; Name = None }
                | "Uint32Array" -> { Type = "System.Array"; InnerTypes = [{ Type = "uint"; InnerTypes = []; Name = None }]; Name = None }
                | "Uint16Array" -> { Type = "System.Array"; InnerTypes = [{ Type = "ushort"; InnerTypes = []; Name = None }]; Name = None }
                | "Uint8Array" -> { Type = "System.Array"; InnerTypes = [{ Type = "byte"; InnerTypes = []; Name = None }]; Name = None }
                | "BigInt64Array" -> { Type = "System.Array"; InnerTypes = [{ Type = "long"; InnerTypes = []; Name = None }]; Name = None }
                | "Int32Array" -> { Type = "System.Array"; InnerTypes = [{ Type = "int"; InnerTypes = []; Name = None }]; Name = None }
                | "Int16Array" -> { Type = "System.Array"; InnerTypes = [{ Type = "short"; InnerTypes = []; Name = None }]; Name = None }
                // A signed 8-bit integer is sbyte in C#; char is a 16-bit UTF-16 code unit.
                | "Int8Array" -> { Type = "System.Array"; InnerTypes = [{ Type = "sbyte"; InnerTypes = []; Name = None }]; Name = None }
                | "RegExp" -> { Type = "string"; InnerTypes = []; Name = None }
                | x -> { Type = x; InnerTypes = []; Name = None }
            | _ -> { Type = "object"; InnerTypes = []; Name = None }
        | "array" ->
            match typeInfo.ElementType with
            | Some elementType -> { Type = "System.Array"; InnerTypes = [getType elementType]; Name = None }
            | _ -> { Type = "System.Array"; InnerTypes = [{ Type = "object"; InnerTypes = []; Name = None }]; Name = None }
        | "stringLiteral" -> { Type = "string"; InnerTypes = []; Name = None }
        | "tuple" ->
            match typeInfo.Types with
            | Some innerTypes ->
                match innerTypes with
                | [] -> { Type = "object"; InnerTypes = []; Name = None }
                | _ -> { Type = "System.ValueTuple"; InnerTypes = innerTypes |> List.map getType; Name = None }
            | _ -> { Type = "object"; InnerTypes = []; Name = None }
        | "union" ->
            match typeInfo.Types with
            | Some innerTypes ->
                match innerTypes with
                | [] -> { Type = "object"; InnerTypes = []; Name = None }
                | _ ->
                    printWarning ("Taking only the first type " + innerTypes.[0].Type + " for the entire union type.")
                    getType innerTypes.[0] // TODO: generate unions
            | _ -> { Type = "object"; InnerTypes = []; Name = None }
        | "intersection" -> { Type = "object"; InnerTypes = []; Name = None } // TODO: generate intersections
        | "reflection" ->
            match typeInfo.Declaration with
            | Some dec ->
                match dec.Signatures with
                | Some [signature] ->
                    // Parameter types of the callback; parameters without a type are dropped.
                    let paras =
                        match signature.Parameters with
                        | Some p ->
                            p
                            |> List.map (fun pi ->
                                match pi.Type with
                                | Some pt -> Some (getType pt)
                                | _ -> None
                            )
                            |> List.collect (fun x ->
                                match x with
                                | Some s -> [s]
                                | _ -> []
                            )
                        | _ -> []
                    let rec getDelegateParas (paras: EntityBodyType list): EntityBodyType list =
                        match paras with
                        | [x] -> [{ Type = x.Type; InnerTypes = x.InnerTypes; Name = None }]
                        | (front::tails) -> [front] @ getDelegateParas tails
                        | _ -> []
                    let returnsType =
                        match signature.Type with
                        | Some t -> getType t
                        | _ -> { Type = "void"; InnerTypes = []; Name = None }
                    let typeParas = getDelegateParas paras
                    match typeParas with
                    // Note: a parameterless callback maps to System.Action even if it returns a value.
                    | [] -> { Type = "System.Action"; InnerTypes = []; Name = None }
                    | _ ->
                        if returnsType.Type = "void"
                        then { Type = "System.Action"; InnerTypes = typeParas; Name = None }
                        else { Type = "System.Func"; InnerTypes = typeParas @ [returnsType]; Name = None }
                | _ -> { Type = "object"; InnerTypes = []; Name = None }
            | _ -> { Type = "object"; InnerTypes = []; Name = None }
        | _ -> { Type = "object"; InnerTypes = []; Name = None }
    let mutable innerTypes =
        match typeInfo.TypeArguments with
        | Some args -> getGenericTypeArguments args
        | _ -> []
    // Promise<void> becomes the non-generic Task.
    if genericType.Type = "System.Threading.Tasks.Task"
    then
        match innerTypes with
        | (front::_) -> if front.Type = "void" then innerTypes <- [] else ()
        | _ -> ()
    else ()
    {
        Type = genericType.Type
        Name = None
        InnerTypes = if innerTypes = [] then genericType.InnerTypes else innerTypes
    }
and getGenericTypeArguments (typeInfos: Type list): EntityBodyType list =
    typeInfos |> List.map getType
and getGenericTypeParameters (nodes: Reflection list) = // TODO: generate constraints
    let types =
        nodes
        |> List.where (fun x -> x.Kind = ReflectionKind.TypeParameter)
        |> List.map (fun x -> x.Name)
    types |> List.map (fun x -> {| Type = x; Constraint = "" |})
For now, generating generic constraints is not supported; I may add it later if I have time.
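The EntityBodyType trees that getType produces still have to be printed as C# type names at some point. The following is a minimal sketch of such a printer; `renderType` and `leaf` are hypothetical names for this illustration, and the rule that a one-element System.Array prints as T[] is an assumption based on the conversion list above, not code taken from the project.

```fsharp
// Same shape as the EntityBodyType record in Entity.fs.
type EntityBodyType =
    { Type: string
      Name: string option
      InnerTypes: EntityBodyType list }

// Hypothetical printer: turn the tree into a C# type expression.
let rec renderType (t: EntityBodyType) : string =
    match t.Type, t.InnerTypes with
    | "System.Array", [ element ] -> renderType element + "[]"
    | name, [] -> name
    | name, inner ->
        let args = inner |> List.map renderType |> String.concat ", "
        sprintf "%s<%s>" name args

let leaf (typeName: string) : EntityBodyType =
    { Type = typeName; Name = None; InnerTypes = [] }

// Roughly what a TypeScript callback (x: number) => string[] would map to.
let callback =
    { Type = "System.Func"
      Name = None
      InnerTypes = [ leaf "double"; { leaf "System.Array" with InnerTypes = [ leaf "string" ] } ] }

printfn "%s" (renderType callback)
```

Because the tree is uniform, nested generics such as a Task of a dictionary need no extra cases; only array syntax is special.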
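Modifier generator
For example public, private, protected, static, and so on. This step is simple: just convert ReflectionFlags. Personally I find that mutable code makes things rather inelegant, but sometimes it is still worth using, because avoiding it would make the code considerably more complex.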
let getModifier (flags: ReflectionFlags) =
    let mutable modifier = []
    match flags.IsPublic with
    | Some flag -> if flag then modifier <- modifier |> List.append [ "public" ] else ()
    | _ -> ()
    match flags.IsAbstract with
    | Some flag -> if flag then modifier <- modifier |> List.append [ "abstract" ] else ()
    | _ -> ()
    match flags.IsPrivate with
    | Some flag -> if flag then modifier <- modifier |> List.append [ "private" ] else ()
    | _ -> ()
    match flags.IsProtected with
    | Some flag -> if flag then modifier <- modifier |> List.append [ "protected" ] else ()
    | _ -> ()
    match flags.IsStatic with
    | Some flag -> if flag then modifier <- modifier |> List.append [ "static" ] else ()
    | _ -> ()
    modifier
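For comparison, the same conversion can be written without mutation by pairing each flag with its keyword. This is only a sketch of an alternative, not the project's code: `Flags` is a cut-down stand-in for ReflectionFlags, `getModifierImmutable` is a hypothetical name, and the keywords come out in declaration order, whereas the mutable version above prepends each keyword and so returns them in the opposite order.

```fsharp
// Only the flags used here; the real ReflectionFlags record has more fields.
type Flags =
    { IsPublic: bool option
      IsAbstract: bool option
      IsPrivate: bool option
      IsProtected: bool option
      IsStatic: bool option }

// Pair each flag with its keyword and keep the keywords whose flag is Some true.
let getModifierImmutable (flags: Flags) : string list =
    [ (flags.IsPublic, "public")
      (flags.IsPrivate, "private")
      (flags.IsProtected, "protected")
      (flags.IsStatic, "static")
      (flags.IsAbstract, "abstract") ]
    |> List.choose (fun (flag, keyword) -> if flag = Some true then Some keyword else None)

let sample =
    { IsPublic = Some true
      IsAbstract = None
      IsPrivate = Some false
      IsProtected = None
      IsStatic = Some true }

printfn "%A" (getModifierImmutable sample)
```

Adding a new modifier then means adding one pair to the list rather than another match block.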
Enum generator
We finally get to parsing entities. Let's start with the simplest one: enums. The code is simple: just convert the enum part of the original AST.
let parseEnum (section: string) (node: Reflection): Entity =
    let values =
        match node.Children with
        | Some children ->
            children
            |> List.where (fun x -> x.Kind = ReflectionKind.EnumMember)
        | None -> []
    {
        Type = EntityType.Enum
        Namespace = if section = "" then "TypeDocGenerator" else section
        Modifier = getModifier node.Flags
        Name = node.Name
        Comment =
            match node.Comment with
            | Some comment -> getXmlDocComment comment
            | _ -> ""
        Methods = []
        Properties = []
        Events = []
        InheritedFrom = []
        Enums =
            values
            |> List.map (fun x ->
                let comment =
                    match x.Comment with
                    | Some comment -> getXmlDocComment comment
                    | _ -> ""
                let mutable intValue = 0L
                match x.DefaultValue with // ?????
                | Some value ->
                    if Int64.TryParse(value, &intValue) then
                        { Comment = comment; Name = toPascalCase x.Name; Value = Some intValue }
                    else
                        match getEnumReferencedValue values value x.Name with
                        | Some t -> { Comment = comment; Name = x.Name; Value = Some (int64 t) }
                        | _ -> { Comment = comment; Name = x.Name; Value = None }
                | _ -> { Comment = comment; Name = x.Name; Value = None }
            )
        TypeParameter = []
    }
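You will notice that I marked one spot above with ?????. What is going on there?
TypeScript enums are, in effect, recursive: when an enum is defined, one member can refer to another member, like this: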
enum MyEnum {
A = 1,
B = 2,
C = A
}
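In that case we need to look up the enum value being referenced. In the example above, when processing C we need to replace its value A with the actual value 1. So we also need a lookup function: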
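let rec getEnumReferencedValue (nodes: Reflection list) value name =
    let referenced =
        nodes
        // skip the member being processed and members with the same initializer
        |> List.where (fun x ->
            match x.DefaultValue with
            | Some v -> v <> value && not (name = x.Name)
            | _ -> true
        )
        // keep the member whose name is the referenced value
        |> List.where (fun x -> x.Name = value)
        // it must have no initializer, or one that parses as an integer
        |> List.tryFind (fun x ->
            let mutable intValue = 0
            match x.DefaultValue with
            | Some y -> Int32.TryParse(y, &intValue)
            | _ -> true
        )
    match referenced with
    | Some t -> t.DefaultValue
    | _ -> None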
That completes our enum parser.
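The lookup can be exercised on the MyEnum example with a cut-down member type. The sketch below is a variation written for illustration, not the function above: `Member` and `resolve` are hypothetical names, and unlike the original it follows chains of references (C = B, B = A) and carries a set of visited names to guard against cycles.

```fsharp
open System

// Simplified enum member: the name plus the raw initializer text, if any.
type Member = { Name: string; DefaultValue: string option }

// Resolve an initializer to a number: either it parses directly, or it names
// another member whose own initializer is resolved in turn.
let rec resolve (members: Member list) (seen: Set<string>) (value: string) : int64 option =
    match Int64.TryParse(value) with
    | true, n -> Some n
    | _ ->
        members
        |> List.tryFind (fun m -> m.Name = value && not (seen.Contains m.Name))
        |> Option.bind (fun m ->
            m.DefaultValue |> Option.bind (resolve members (seen.Add m.Name)))

let myEnum =
    [ { Name = "A"; DefaultValue = Some "1" }
      { Name = "B"; DefaultValue = Some "2" }
      { Name = "C"; DefaultValue = Some "A" } ]

// Resolve the initializer of C, marking C itself as already visited.
let valueOfC = resolve myEnum (Set.ofList [ "C" ]) "A"
printfn "%A" valueOfC
```

Returning an option keeps the "could not resolve" case explicit, which matches how parseEnum falls back to a member without a value.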
Interface and class generator
Now for the main event: interfaces and classes are the heart of the type bindings.
Our function signature looks like this:
let parseInterfaceAndClass (section: string) (node: Reflection) (isInterface: bool): Entity = ...
First, from the Reflection node we look up and generate the comment, modifiers, name, generic parameters, inheritance, methods, properties, and events:
    let comment =
        match node.Comment with
        | Some comment -> getXmlDocComment comment
        | _ -> ""
    let exts =
        match node.ExtendedTypes with
        | Some types -> types |> List.map (fun x -> getType x)
        | _ -> []
    let genericType =
        let types =
            match node.TypeParameter with
            | Some tp -> Some (getGenericTypeParameters tp)
            | _ -> None
        match types with
        | Some result -> result
        | _ -> []
    let properties =
        match node.Children with
        | Some children ->
            if isInterface then
                children
                |> List.where (fun x -> x.Kind = ReflectionKind.Property)
                |> List.where (fun x -> x.InheritedFrom = None) // exclude inherited properties
                |> List.where (fun x -> x.Overwrites = None) // exclude overwritten properties
            else children |> List.where (fun x -> x.Kind = ReflectionKind.Property)
        | _ -> []
    let events =
        match node.Children with
        | Some children ->
            if isInterface then
                children
                |> List.where (fun x -> x.Kind = ReflectionKind.Event)
                |> List.where (fun x -> x.InheritedFrom = None) // exclude inherited events
                |> List.where (fun x -> x.Overwrites = None) // exclude overwritten events
            else children |> List.where (fun x -> x.Kind = ReflectionKind.Event)
        | _ -> []
    let methods =
        match node.Children with
        | Some children ->
            if isInterface then
                children
                |> List.where (fun x -> x.Kind = ReflectionKind.Method)
                |> List.where (fun x -> x.InheritedFrom = None) // exclude inherited methods
                |> List.where (fun x -> x.Overwrites = None) // exclude overwritten methods
            else children |> List.where (fun x -> x.Kind = ReflectionKind.Method)
        | _ -> []
One thing to note: a child interface does not need to repeat the members of its parent interface, so inherited and overwritten members are excluded for interfaces.
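The three selections for properties, events, and methods differ only in the kind they look for, so they could share one helper. The sketch below shows that idea on a cut-down child record; `Child` and `membersOf` are hypothetical names for this illustration, and InheritedFrom and Overwrites are simplified to strings instead of the Type record.

```fsharp
type ReflectionKind =
    | Property = 1024
    | Method = 2048

// Simplified child node (assumption: only these fields matter for the filter).
type Child =
    { Name: string
      Kind: ReflectionKind
      InheritedFrom: string option
      Overwrites: string option }

// Select the children of one kind; for interfaces, keep only members declared on
// the interface itself, dropping inherited and overwritten ones.
let membersOf (kind: ReflectionKind) (isInterface: bool) (children: Child list option) : Child list =
    let declaredHere (c: Child) =
        Option.isNone c.InheritedFrom && Option.isNone c.Overwrites
    children
    |> Option.defaultValue []
    |> List.filter (fun c -> c.Kind = kind && (not isInterface || declaredHere c))

let children =
    Some [ { Name = "value"; Kind = ReflectionKind.Property; InheritedFrom = None; Overwrites = None }
           { Name = "id"; Kind = ReflectionKind.Property; InheritedFrom = Some "IBase.id"; Overwrites = None }
           { Name = "dispose"; Kind = ReflectionKind.Method; InheritedFrom = None; Overwrites = Some "IBase.dispose" } ]

printfn "%A" (membersOf ReflectionKind.Property true children |> List.map (fun c -> c.Name))
```

For a class the same call with isInterface set to false keeps every member of the requested kind, which mirrors the else branches in the listing above.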
Then we simply return a record that represents the entity for this node.
    {
        Type = if isInterface then EntityType.Interface else EntityType.Class
        Namespace = if section = "" then "TypedocConverter" else section
        Name = node.Name
        Comment = comment
        Modifier = getModifier node.Flags
        InheritedFrom = exts
        Methods = methods |> List.map (