
Symbol Table Design

· 4 min read

Introduction

Today I intend to talk about the symbol table design of the forthcoming Ausar language. Here comes the first question: what is a symbol table? As the Wikipedia page Symbol table puts it,

In computer science, a symbol table is a data structure used by a language translator such as a compiler or interpreter, where each identifier (or symbol), constant, procedure and function in a program's source code is associated with information relating to its declaration or appearance in the source.

The symbol table is an essential data structure in the implementation of compilers. The internal symbol table of Ausar provides the following key capabilities:

  • It stores all the variables with extended information in one place.

    • The ultimate symbol table is a multi-level tree with a sole entrance node (the top-level node).
  • It can be used to detect redeclarations of variables.

    • A unique variable can only be declared once.
  • It can be used to determine the scope of a variable.

    • Every variable is bound to a unique scope.
  • It can be utilized to implement type checking, ensuring the semantic correctness of source code.

    • A literal value cannot be assigned to a function variable.
    • Only variables can be declared and assigned.

Key Design

Architecture Illustration

As illustrated above, the symbol table is a multi-level tree with only a sole entrance (the top-level scope). This multi-level architecture is built from four basic levels as its skeleton:

  • TopLevelScope
    • The builtin functions and packages are registered in the top-level scope, which is read-only for all subsequent scopes.
  • PkgScope
    • This scope holds the packages defined by users. Every package is declared in this scope under its unique package name.
  • BasicDeclScope
    • Functions are first-class citizens, each with a distinct scope extending the package scope.
  • ControlFlowScope
    • This scope is instantiated by control flow statements such as the if-else statement and the for statement. Such scopes are self-recursive, meaning a control flow scope may contain nested sub control flow scopes:
    • if a > b {
          if a > 3 {
          }
      }

The architecture resembles the prototype chain in JavaScript: the declare operation makes side effects only in its own scope, while the get & set operations can affect the scopes rooted along the same extension path.

Implementation

Data Structure

type SymbolTable struct {
	p *SymbolTable
	a map[string]TypedValue
}

type TypedValue struct {
	IdentifierType
	Value any
}

The SymbolTable holds two basic fields: a pointer p pointing to its parent SymbolTable, and a map field a which holds the variables declared in its own scope.
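The snippets below also rely on a NewTypedValue constructor (and, implicitly, a way to create child scopes) that the post does not show. A plausible sketch, with hypothetical names and IdentifierType values:

```go
package main

import "fmt"

// IdentifierType distinguishes the kinds of identifiers (assumed definition).
type IdentifierType int

const (
	VariableIdentifier IdentifierType = iota
	FunctionIdentifier
)

// TypedValue bundles an identifier type with its runtime value.
type TypedValue struct {
	IdentifierType
	Value any
}

// NewTypedValue is the constructor used by Set and Declare.
func NewTypedValue(iType IdentifierType, value any) TypedValue {
	return TypedValue{IdentifierType: iType, Value: value}
}

// SymbolTable as defined in the post.
type SymbolTable struct {
	p *SymbolTable
	a map[string]TypedValue
}

// NewSymbolTable creates a scope whose parent is p (nil for the top level).
func NewSymbolTable(p *SymbolTable) *SymbolTable {
	return &SymbolTable{p: p, a: make(map[string]TypedValue)}
}

func main() {
	top := NewSymbolTable(nil)
	child := NewSymbolTable(top)
	fmt.Println(child.p == top) // prints true
}
```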

Basic Operations

Get
func (s *SymbolTable) Get(name string) (TypedValue, error) {
	if val, ok := s.a[name]; ok {
		return val, nil
	} else if s.p != nil {
		return s.p.Get(name)
	}
	return TypedValue{}, fmt.Errorf("variable [%s] not declared", name)
}

The Get operation follows a recursive, ascending strategy to look up the target variable. It first searches for the variable in the local scope and returns the value if it exists. If the variable does not exist there, it calls the Get method of its parent symbol table.

Set
func (s *SymbolTable) Set(name string, iType IdentifierType, val any) error {
	if curVal, ok := s.a[name]; ok {
		if curVal.IdentifierType != iType {
			return fmt.Errorf("iType not equal: want [%v], cur [%v]", curVal.IdentifierType, iType)
		}
		s.a[name] = NewTypedValue(iType, val)
		return nil
	} else if s.p != nil {
		return s.p.Set(name, iType, val)
	}
	return fmt.Errorf("variable [%s] not declared", name)
}

Like the Get operation, the Set operation operates on the local scope if the variable has been declared there. Otherwise, it delegates to the Set method of its parent symbol table.

Declare
func (s *SymbolTable) Declare(name string, iType IdentifierType, value any) error {
	if _, ok := s.a[name]; ok {
		return fmt.Errorf("variable [%s] has already been declared", name)
	}
	s.a[name] = NewTypedValue(iType, value)
	return nil
}

As illustrated above, the declaration operation makes side effects only in its own scope.
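Putting the three operations together, the following self-contained sketch shows the prototype-chain behavior: Declare stays local, while Set and Get walk up the parent chain. The NewSymbolTable and NewTypedValue helpers and the IdentifierType value are assumptions for illustration, not Ausar's actual code.

```go
package main

import "fmt"

type IdentifierType int

const VariableIdentifier IdentifierType = iota

type TypedValue struct {
	IdentifierType
	Value any
}

func NewTypedValue(iType IdentifierType, val any) TypedValue {
	return TypedValue{IdentifierType: iType, Value: val}
}

type SymbolTable struct {
	p *SymbolTable
	a map[string]TypedValue
}

func NewSymbolTable(p *SymbolTable) *SymbolTable {
	return &SymbolTable{p: p, a: make(map[string]TypedValue)}
}

func (s *SymbolTable) Get(name string) (TypedValue, error) {
	if val, ok := s.a[name]; ok {
		return val, nil
	} else if s.p != nil {
		return s.p.Get(name)
	}
	return TypedValue{}, fmt.Errorf("variable [%s] not declared", name)
}

func (s *SymbolTable) Set(name string, iType IdentifierType, val any) error {
	if curVal, ok := s.a[name]; ok {
		if curVal.IdentifierType != iType {
			return fmt.Errorf("iType not equal: want [%v], cur [%v]", curVal.IdentifierType, iType)
		}
		s.a[name] = NewTypedValue(iType, val)
		return nil
	} else if s.p != nil {
		return s.p.Set(name, iType, val)
	}
	return fmt.Errorf("variable [%s] not declared", name)
}

func (s *SymbolTable) Declare(name string, iType IdentifierType, val any) error {
	if _, ok := s.a[name]; ok {
		return fmt.Errorf("variable [%s] has already been declared", name)
	}
	s.a[name] = NewTypedValue(iType, val)
	return nil
}

func main() {
	top := NewSymbolTable(nil)
	child := NewSymbolTable(top)

	_ = top.Declare("x", VariableIdentifier, 1)
	_ = child.Set("x", VariableIdentifier, 2) // walks up and updates top's "x"
	v, _ := top.Get("x")
	fmt.Println(v.Value) // prints 2

	_ = child.Declare("y", VariableIdentifier, 3) // local to child only
	_, err := top.Get("y")
	fmt.Println(err != nil) // prints true: "y" is invisible to the parent
}
```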

Conclusion

The symbol table of the Ausar language is simple and robust, implementing most of the basic features of a standard symbol table.

Ausar is Coming

· 3 min read
Ausar Wong
Free Developer

Introduction

Ausar is a light-weight script language written in pure Go with a dynamic type system. Its grammar is a deliberately simplified subset of the standard Go Spec. The original intention of this repo was to implement an expression engine as the core lib for a business rule engine and a formula engine.

Why Did I Implement This Repo?

Close Relationship with My Working Scenario

I must emphasize that this repo is not a reimplementation of the many existing expression engine libraries. I work in a securities company where NUMERICAL ACCURACY is a serious concern. For instance, the execution result of the expr 0.1+0.2 must be exactly 0.3 in my working scenarios. However, the mainstream of the existing open-source libraries fails to achieve this, because they take the float number as their underlying number type. With floats, the execution result becomes ridiculous when arithmetic operations are repeated a tremendous number of times.

Every Number Should Be Decimal

To solve the issue demonstrated above, Ausar takes Decimal as the underlying type for numbers. Every inputted number is transformed into the decimal type. Meanwhile, with the help of the famous Go decimal library, every arithmetic operation on numbers is converted into the corresponding decimal method. For instance, the expr 0.1+0.2*0.3 is eventually transformed into Decimal(0.1).Add(Decimal(0.2).Mul(Decimal(0.3))) during the execution phase.

From an Expression Engine to a Simplified Language

The original purpose of the repo was to implement a simple expression executor for the business rule engine. Like its equivalent Rust version, expression_engine_rs, which I implemented earlier, the original version only supported basic expressions such as unary, binary, and ternary expressions. The only complicated case was the chain expression, whose sub-expressions are separated by the punctuation ";" and executed sequentially.

Things became complicated when the formula engine scenario arose. Let me show a brief example below:

func max(a, b) {
	if a > b {
		return a;
	};
	return b;
}

wA := 0.5;
wB := 0.4;

hA := 0.6;
hB := 0.9;

// size is the product of max(wA, wB) and max(hA, hB).
size := max(wA,wB)*max(hA,hB);

This scenario requires more sophisticated features such as function declarations and basic control flow. Eventually, the intention grew into the implementation of a light-weight script language.

Enthusiasm for Compiler Design

I love compiler design and have a genuine passion for decoding a string into its corresponding abstract syntax tree. I once spent 7 to 8 months implementing some well-known automated parsing algorithms, such as LL, LR, Earley, and PEG. Thankfully, I eventually achieved these goals, and I aspired to apply the knowledge to some realistic scenarios. This repo combines the handwritten-parser techniques of my early work with the automated parsers (mainly PEG) of my recent work.

Compilers: Automated vs Handwritten

· 11 min read
Ausar Wong
Free Developer

Introduction

As we all know, there are many parser-generation techniques such as Yacc, Bison, and ANTLR, which produce either an LL parser or an LR parser. Such parsers are automated: a compiler designer can easily "implement" a sophisticated parser with these techniques on the basis of a predefined CFG or PEG grammar. These delicate techniques facilitate the design and implementation of a novel programming language. However, the majority of modern languages, such as Go and Rust, have abandoned such automation and switched to handwritten candidates. Why are technologies that were prosperous decades ago not popular today?

Short Answers

I think there are three major reasons for the revival of handwritten parsers:

  • Efficiency
  • Error Tracing
  • Code Optimization

Experiment

I conduct an experiment here to better illustrate the three reasons listed above. The experiment is simple: deserializing a valid JSON string. Specifically, the JSON deserialization task is composed of two subtasks: a) decoding data from a JSON string; b) binding the decoded data to a structured representation.

The former subtask is generic, and all JSON parsers need to accomplish it. Meanwhile, it is unnecessary to bind the decoded JSON data to countless kinds of data structures; for experimental purposes, it is enough to bind the data to a basic JSON representation.

JSON Representation Definition

According to the JSON specification, JSON is a data format with 6 fundamental data types: Null, Number, Boolean, String, Array, and Object. Below, the type JsonValue represents a basic JSON value, with six fields corresponding to the six fundamental types. For instance, if an instantiated JsonValue represents a Null JSON value, the Null field is assigned while the five remaining fields stay nil.

type JsonValue struct {
	Null   any        `json:"null,omitempty"`
	Number *float64   `json:"number,omitempty"`
	Str    *string    `json:"str,omitempty"`
	Array  JsonArray  `json:"array,omitempty"`
	Object JsonObject `json:"object,omitempty"`
	Bool   *bool      `json:"bool,omitempty"`
}

type JsonArray []JsonValue
type JsonObject map[string]JsonValue

Handwritten Parser

At first, I implemented a handwritten JSON parser, a top-down recursive descent one. Thanks to the brief JSON spec, the whole implementation is simple: just about 170 lines of code, an entrance handler with 6 branch handlers.

const (
	NumberRegStr = `-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][-+]?(0|[1-9][0-9]*))?`
	StringRegStr = `^"([^"\\]*|\\["\\bfnrt/]|\\u[0-9a-f]{4})*"`
)

var numberReg = regexp.MustCompile(NumberRegStr)

var stringReg = regexp.MustCompile(StringRegStr)

type JsonDeserializerInterface interface {
	Deserialize(input string) (*JsonValue, bool)
}

type JsonDeserializer struct{}

func (j JsonDeserializer) Deserialize(input string) (*JsonValue, bool) {
	value, _, ok := j.deserialize(input)
	return value, ok
}

type DeserializerHandlerInterface = func(input string) (value *JsonValue, remaining string, ok bool)

func (j JsonDeserializer) deserialize(input string) (value *JsonValue, remaining string, ok bool) {
	for _, handler := range []DeserializerHandlerInterface{
		j.deserializeNull,
		j.deserializeNumber,
		j.deserializeBool,
		j.deserializeString,
		j.deserializeArray,
		j.deserializeObject,
	} {
		if value, remaining, ok = handler(input); ok {
			return
		}
	}
	return nil, input, false
}

func (j JsonDeserializer) deserializeNull(input string) (*JsonValue, string, bool) {
	remaining := strings.TrimSpace(input)
	if strings.HasPrefix(remaining, "null") {
		return &JsonValue{
			Null: true,
		}, remaining[4:], true
	}
	return nil, input, false
}

func (j JsonDeserializer) deserializeNumber(input string) (*JsonValue, string, bool) {
	remaining := strings.TrimSpace(input)
	loc := numberReg.FindStringIndex(remaining)
	if len(loc) == 0 || loc[0] != 0 {
		return nil, input, false
	}
	numberStr := remaining[loc[0]:loc[1]]
	number, err := strconv.ParseFloat(numberStr, 64)
	if err != nil {
		return nil, input, false
	}
	return &JsonValue{
		Number: &number,
	}, remaining[loc[1]:], true
}

func (j JsonDeserializer) deserializeBool(input string) (*JsonValue, string, bool) {
	remaining := strings.TrimSpace(input)
	var value bool
	if strings.HasPrefix(remaining, "true") {
		value = true
		return &JsonValue{
			Bool: &value,
		}, remaining[4:], true
	} else if strings.HasPrefix(remaining, "false") {
		return &JsonValue{
			Bool: &value,
		}, remaining[5:], true
	}
	return nil, input, false
}

func (j JsonDeserializer) deserializeString(input string) (*JsonValue, string, bool) {
	remaining := strings.TrimSpace(input)
	loc := stringReg.FindStringIndex(remaining)
	if len(loc) == 0 || loc[0] != 0 {
		return nil, input, false
	}
	s := remaining[1 : loc[1]-1]
	return &JsonValue{
		Str: &s,
	}, remaining[loc[1]:], true
}

func (j JsonDeserializer) deserializeArray(input string) (*JsonValue, string, bool) {
	var remaining string
	var ok bool
	if remaining, ok = j.expect(input, "["); !ok {
		return nil, input, false
	}
	var ans []JsonValue
	for {
		var value *JsonValue
		value, remaining, ok = j.deserialize(remaining)
		if !ok {
			break
		}
		ans = append(ans, *value)
		remaining, ok = j.expect(remaining, ",")
		if !ok {
			break
		}
	}
	if remaining, ok = j.expect(remaining, "]"); !ok {
		return nil, input, false
	}
	return &JsonValue{
		Array: ans,
	}, remaining, true
}

func (j JsonDeserializer) deserializeObject(input string) (*JsonValue, string, bool) {
	input = strings.TrimSpace(input)
	var remaining string
	var ok bool
	if remaining, ok = j.expect(input, "{"); !ok {
		return nil, input, false
	}
	m := make(JsonObject)
	for {
		var key *JsonValue
		var value *JsonValue
		key, remaining, ok = j.deserializeString(remaining)
		if !ok {
			break
		}
		remaining, ok = j.expect(remaining, ":")
		if !ok {
			return nil, input, false
		}
		value, remaining, ok = j.deserialize(remaining)
		if !ok {
			return nil, input, false
		}
		m[*key.Str] = *value
		remaining, ok = j.expect(remaining, ",")
		if !ok {
			break
		}
	}
	if remaining, ok = j.expect(remaining, "}"); !ok {
		return nil, input, false
	}
	return &JsonValue{
		Object: m,
	}, remaining, true
}

func (j JsonDeserializer) expect(input string, expected string) (string, bool) {
	input = strings.TrimSpace(input)
	if strings.HasPrefix(input, expected) {
		return input[len(expected):], true
	}
	return input, false
}

Auto-generated

Unlike the handwritten parser, its automated rivals require a formal grammar definition and the corresponding semantic actions to accomplish the deserialization task.

CFG

A standard JSON CFG is as below:

Json      -> Value
Array     -> LBracket ValueList RBracket | LBracket RBracket
Object    -> LCurly PairList RCurly | LCurly RCurly
PairList  -> Pair Comma PairList | Pair
ValueList -> Value Comma ValueList | Value
Pair      -> String Colon Value
Value     -> String | Number | Object | Array | True | False | Null
LCurly    -> "{"
RCurly    -> "}"
LBracket  -> "\["
RBracket  -> "\]"
String    -> "^\"([^\"\\]*|\\[\"\\bfnrt/]|\\u[0-9a-f]{4})*\""
Number    -> "-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][-+]?(0|[1-9][0-9]*))?"
True      -> "true"
False     -> "false"
Null      -> "null"
Comma     -> ","
Colon     -> ":"

LL

The LL Parser is a Top-Down, non-recursive parser. It is auto-generated and table-driven.

Semantic Actions

For every production in the grammar, we need to define the corresponding semantic action.

var llJsonActions = []ll.SemanticAction[string, JsonValue]{
	// 0: Array -> LBracket Array"
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[1].Variable.Value
		return nil
	},

	// 1: Array" -> RBracket
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Array: []JsonValue{},
		}
		return nil
	},

	// 2: Array" -> ValueList RBracket
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[0].Variable.Value
		return nil
	},

	// 3: Json -> Value
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[0].Variable.Value
		return nil
	},

	// 4: Object -> LCurly Object"
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[1].Variable.Value
		return nil
	},

	// 5: Object" -> PairList RCurly
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[0].Variable.Value
		return nil
	},

	// 6: Object" -> RCurly
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Object: map[string]JsonValue{},
		}
		return nil
	},

	// 7: Pair -> String Colon Value
	func(v *tree.Variable[string, JsonValue]) error {
		key := v.Children[0].Terminal.Matched
		value := v.Children[2].Variable.Value
		v.Value = JsonValue{
			Object: map[string]JsonValue{
				key: value,
			},
		}
		return nil
	},

	// 8: PairList -> Pair PairList"
	func(v *tree.Variable[string, JsonValue]) error {
		a, b := v.Children[0].Variable.Value.Object, v.Children[1].Variable.Value.Object
		for key, value := range b {
			a[key] = value
		}
		v.Value = JsonValue{
			Object: a,
		}
		return nil
	},

	// 9: PairList" -> #
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Object: map[string]JsonValue{},
		}
		return nil
	},

	// 10: PairList" -> Comma PairList
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[1].Variable.Value
		return nil
	},

	// 11: Value -> Array
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[0].Variable.Value
		return nil
	},

	// 12: Value -> False
	func(v *tree.Variable[string, JsonValue]) error {
		value := false
		v.Value = JsonValue{
			Bool: &value,
		}
		return nil
	},

	// 13: Value -> Null
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Null: true,
		}
		return nil
	},

	// 14: Value -> Number
	func(v *tree.Variable[string, JsonValue]) error {
		ans, err := strconv.ParseFloat(v.Children[0].Terminal.Matched, 64)
		if err != nil {
			return err
		}
		v.Value = JsonValue{
			Number: &ans,
		}
		return nil
	},

	// 15: Value -> Object
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[0].Variable.Value
		return nil
	},

	// 16: Value -> String
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Str: &v.Children[0].Terminal.Matched,
		}
		return nil
	},

	// 17: Value -> True
	func(v *tree.Variable[string, JsonValue]) error {
		value := true
		v.Value = JsonValue{
			Bool: &value,
		}
		return nil
	},

	// 18: ValueList -> Value ValueList"
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Array: append([]JsonValue{v.Children[0].Variable.Value}, v.Children[1].Variable.Value.Array...),
		}
		return nil
	},

	// 19: ValueList" -> #
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Array: []JsonValue{},
		}
		return nil
	},

	// 20: ValueList" -> Comma ValueList
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[1].Variable.Value
		return nil
	},
}

LR

The LR Parser is a Bottom-Up, non-recursive parser. Like the LL Parser, it is auto-generated and table-driven as well.

Semantic Actions

For every production in the grammar, we need to define the corresponding semantic action.

var lrJsonActions = []lr.SemanticAction[string, JsonValue]{
	// 0: Json' -> Json
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[0].Variable.Value
		return nil
	},

	// 1: Json -> Value
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[0].Variable.Value
		return nil
	},

	// 2: Array -> LBracket ValueList RBracket
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[1].Variable.Value
		return nil
	},

	// 3: Array -> LBracket RBracket
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Array: make(JsonArray, 0),
		}
		return nil
	},

	// 4: Object -> LCurly PairList RCurly
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[1].Variable.Value
		return nil
	},

	// 5: Object -> LCurly RCurly
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Object: make(JsonObject),
		}
		return nil
	},

	// 6: PairList -> Pair Comma PairList
	func(v *tree.Variable[string, JsonValue]) error {
		a, b := v.Children[0].Variable.Value.Object, v.Children[2].Variable.Value.Object
		for key, value := range b {
			a[key] = value
		}
		v.Value = JsonValue{
			Object: a,
		}
		return nil
	},

	// 7: PairList -> Pair
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[0].Variable.Value
		return nil
	},

	// 8: ValueList -> Value Comma ValueList
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Array: append([]JsonValue{v.Children[0].Variable.Value}, v.Children[2].Variable.Value.Array...),
		}
		return nil
	},

	// 9: ValueList -> Value
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Array: []JsonValue{v.Children[0].Variable.Value},
		}
		return nil
	},

	// 10: Pair -> String Colon Value
	func(v *tree.Variable[string, JsonValue]) error {
		key := v.Children[0].Terminal.Matched
		value := v.Children[2].Variable.Value
		v.Value = JsonValue{
			Object: map[string]JsonValue{
				key: value,
			},
		}
		return nil
	},

	// 11: Value -> String
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Str: &v.Children[0].Terminal.Matched,
		}
		return nil
	},

	// 12: Value -> Number
	func(v *tree.Variable[string, JsonValue]) error {
		ans, err := strconv.ParseFloat(v.Children[0].Terminal.Matched, 64)
		if err != nil {
			return err
		}
		v.Value = JsonValue{
			Number: &ans,
		}
		return nil
	},

	// 13: Value -> Object
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[0].Variable.Value
		return nil
	},

	// 14: Value -> Array
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = v.Children[0].Variable.Value
		return nil
	},

	// 15: Value -> True
	func(v *tree.Variable[string, JsonValue]) error {
		value := true
		v.Value = JsonValue{
			Bool: &value,
		}
		return nil
	},

	// 16: Value -> False
	func(v *tree.Variable[string, JsonValue]) error {
		value := false
		v.Value = JsonValue{
			Bool: &value,
		}
		return nil
	},

	// 17: Value -> Null
	func(v *tree.Variable[string, JsonValue]) error {
		v.Value = JsonValue{
			Null: true,
		}
		return nil
	},
}

Comparison

Efficiency

Performance is where the comparison is most direct. I conducted an experiment to benchmark the different parsers. The test data is as below:

// Here is a brief JSON document
{
	"haha": "haha",
	"dd": 23,
	"d": true,
	"c": null,
	"extended": {"haha": "haha", "dd": 23, "d": true, "c": null},
	"list": [1, 2, false]
}


Benchmark for Handwritten Parser


Benchmark for Auto-generated LL Parser


Benchmark for Auto-generated LR Parser

The benchmarks above show that the handwritten parser performs much better than the auto-generated ones: at 19846 ns/op and 78 allocs/op, it is roughly 10 times faster and more space-efficient than the latter.

Error Tracing & Code Optimization

Error tracing and code optimization play a vital role in the implementation of a robust and stable parser. As illustrated above, the whole parsing process can be divided into two subprocesses: a) decoding and b) binding. For handwritten parsers, the designers can debug and optimize every piece of code. Auto-generated parsers cannot receive such thorough debugging and optimization: their runtime is not accessible to the implementers, who can only manipulate the code of subprocess b), binding. It is hard to optimize the runtime for a specific scenario unless you implement the runtime yourself, and at that point you have effectively switched to a handwritten parser.
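One concrete benefit of owning the runtime: every failure site in a handwritten parser can report precise position information. A minimal sketch (the expectAt helper is hypothetical, not part of the parser above) of an expect variant that carries the byte offset in its error:

```go
package main

import "fmt"

// expectAt consumes the expected token at offset pos, or returns an
// error that pinpoints where in the input the match failed.
func expectAt(input string, pos int, expected string) (int, error) {
	if len(input)-pos >= len(expected) && input[pos:pos+len(expected)] == expected {
		return pos + len(expected), nil
	}
	return pos, fmt.Errorf("offset %d: expected %q, got %q", pos, expected, tail(input, pos))
}

// tail returns a short excerpt of the input starting at pos, for error messages.
func tail(input string, pos int) string {
	end := pos + 10
	if end > len(input) {
		end = len(input)
	}
	return input[pos:end]
}

func main() {
	_, err := expectAt(`{"a" 1}`, 4, ":")
	fmt.Println(err) // the error names the exact offset and the offending text
}
```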

Conclusion

This post provides a brief but fairly detailed explanation of the popularity of handwritten parsers, which is the result of the pursuit of better efficiency, easier error tracing, and code optimization.

A Brief Introduction to Golang Generics

· 12 min read

Background

Golang introduced generics in version 1.18. This post gives a brief introduction to Golang generics and how to use them. The content and examples mainly come from the official Type Parameters Proposal, so this post can be regarded as a translation of it; reading the original is recommended if you can.

Abstract

The main points of Golang generics are as follows:

  1. Functions support extra type parameters, defined inside []; a generic function looks like func F[T any](t T) {}
  2. A function's type parameters can be used in its parameter list and in its body
  3. Type definitions also support extra type parameters, e.g. type M[T any] []T
  4. Every type parameter has a corresponding constraint: func F[T Constraint](t T) {}
  5. A type constraint is an extension of the interface type
  6. The newly introduced any keyword is a type constraint standing for any type, just like interface{}
  7. An interface type used as a constraint can embed extra elements that restrict which type arguments satisfy it (a constraint can be seen as a recursively nested structure):
    1. An arbitrary type T
    2. ~T restricts the type argument's underlying type to T
    3. T1 | T2 | ... restricts the type argument to any one of the listed types
        type NumberConstraint interface {
            int | int8 | int16 | int32 | int64 | uint | uint8 | uint16 | uint32 | uint64
            String() string
        }
        // This defines a number constraint: any of Golang's integer types that also has a String method.
        // In fact this is an empty constraint, since none of the builtin integer types has a String method.
  8. A generic function may only use operations supported by every type that satisfies its constraints
  9. Using a generic function or a generic type requires passing concrete type arguments
  10. Golang supports type inference, so in practice the concrete type arguments do not always need to be passed explicitly
    1. Many modern languages, such as Rust, support type inference: the compiler infers the concrete types from the arguments

Type Parameters

Golang extends type declarations to support extra type parameters:

// Vector is a slice of an arbitrary element type; it takes a type parameter T
type Vector[T any] []T

// v is an int slice; T is instantiated with int
var v Vector[int]

// b is a uint slice; T is instantiated with uint
var b Vector[uint]

// d is a string slice; T is instantiated with string
var d Vector[string]

Correspondingly, function declarations also support extra type parameters:

// Stringify is a generic function; T is its type parameter
func Stringify[T any](s []T) (ret []string) {
	for _, v := range s {
		ret = append(ret, v.String()) // actually invalid: not every type has a String method
	}
	return ret
}

Multiple Type Parameters

Golang generics support multiple type parameters: both generic types and generic functions can take more than one type parameter.

// Two type parameters are shown here; in fact there can be more

// The generic function Print2 has two type parameters T1 and T2, and two ordinary parameters s1 and s2
func Print2[T1, T2 any](s1 []T1, s2 []T2) {...}

// The generic type Test has two type parameters T1 and T2
type Test[T1, T2 any] struct {
	T1 T1
	T2 T2
}

Constraints

Golang already has a concept quite similar to type constraints: the interface type. An interface type is defined by a list of methods, and only data types that implement all of the methods defined in the interface can be assigned to that interface type.

// Person is an interface type; a Person must implement two methods:
// 1. the Name method
// 2. the Gender method
type Person interface {
	Name() string
	Gender() string
}

type Male struct {
	name string
}

type Female struct {
	name string
}

func (p Male) Name() string {
	return p.name
}

func (p Male) Gender() string {
	return "male"
}

func (f Female) Name() string {
	return f.name
}

func (f Female) Gender() string {
	return "female"
}

// The Name and Gender functions take a Person interface as input,
// so both Male and Female values can be passed in
func Name(p Person) string {
	return p.Name()
}

func Gender(p Person) string {
	return p.Gender()
}

Golang extends interface types so that they can serve as type constraints. Calling a generic function with a type argument is similar to assigning to an interface type: the type argument must implement the constraint.

// Stringer is a constraint requiring the type to implement a String method
type Stringer interface {
	String() string
}

// The type parameter T must satisfy the constraint Stringer
func PrintList[T Stringer](list []T) []string {
	ans := make([]string, 0, len(list)) // note: []string, since we collect the String() results
	for _, item := range list {
		ans = append(ans, item.String())
	}
	return ans
}

Likewise, when instantiating a generic type with a type argument, that type argument must satisfy the corresponding constraint.

// The type parameter T must satisfy the constraint Person
type Human[T Person] struct {
	Person T
}

The any Keyword

The any keyword stands for an arbitrary type and is equivalent to Golang's empty interface interface{}.

// Vector1 and Vector2 are equivalent; both are slices of an arbitrary element type
type Vector1[T any] []T         // T's constraint is any
type Vector2[T interface{}] []T // T's constraint is interface{}

Data Type Constraints

A type constraint can enumerate the data types that satisfy it.

// Number constrains the type to int|int8|uint
type Number interface {
	int | int8 | uint
}

Underlying Type Constraints

type Number interface {
	int | int8 | uint
}

func Min[T Number](a, b T) T {
	if a < b {
		return a
	}
	return b
}

// Int's underlying type is int
type Int int

func main() {
	var a Int = 10
	var b Int = 50
	c := Min(a, b) // invalid: a does not satisfy the Number constraint, since its type is not one of int|int8|uint
	_ = c
}

The code above is rejected by the Go compiler, because the Int type does not satisfy the Number constraint: Int is not one of int|int8|uint. However, Int's underlying type is int, and a Min function that supports int should in theory also work for Int. For this class of problems, Golang introduced the ~ token, which declares a constraint on the underlying type.

// Rewritten with underlying type constraints, the problem above is solved
type Number interface {
	~int | ~int8 | ~uint
}
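A complete program showing the fix in action (assuming Go 1.18+):

```go
package main

import "fmt"

// With ~, the constraint admits any type whose underlying type matches.
type Number interface {
	~int | ~int8 | ~uint
}

func Min[T Number](a, b T) T {
	if a < b {
		return a
	}
	return b
}

// Int's underlying type is int, so it now satisfies Number.
type Int int

func main() {
	var a Int = 10
	var b Int = 50
	fmt.Println(Min(a, b)) // prints 10
}
```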

Constraint Inheritance

As an extension of interface types, type constraints themselves also support inheritance (embedding).

type Talk interface {
	Talk()
}

type Run interface {
	Run()
}

type Jump interface {
	Jump()
}

// The Boy constraint requires Talk, Run, and Jump to be satisfied simultaneously
type Boy interface {
	Talk
	Run
	Jump
}
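A short sketch of the combined constraint in use; the Robot type and the Act function are made up for illustration:

```go
package main

import "fmt"

type Talk interface{ Talk() }
type Run interface{ Run() }
type Jump interface{ Jump() }

// Boy embeds the three constraints; a type argument must satisfy all of them.
type Boy interface {
	Talk
	Run
	Jump
}

// Robot implements all three methods, so it satisfies Boy.
type Robot struct{}

func (Robot) Talk() { fmt.Println("talk") }
func (Robot) Run()  { fmt.Println("run") }
func (Robot) Jump() { fmt.Println("jump") }

// Act accepts any type argument satisfying the embedded constraint.
func Act[T Boy](t T) {
	t.Talk()
	t.Run()
	t.Jump()
}

func main() {
	Act(Robot{})
}
```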

Generic Types

A generic type takes extra type parameters, which can be referenced in the rest of the type definition.

// The generic type Test takes two type parameters T1 and T2; its fields A1 and A2 have types T1 and T2
type Test[T1, T2 any] struct {
	A1 T1 // references the type parameter T1
	A2 T2 // references the type parameter T2
}

Type Methods

Unlike generic functions, the methods of a generic type can only use the type parameters declared in the generic type's definition (a difference from Rust).

// The Print method of the generic type Test can reference the type parameters T1 and T2 declared on Test
// (note that the receiver lists the parameters without their constraints)
func (t Test[T1, T2]) Print(a T1, b T2) {}

// Declaring additional type parameters on a method is illegal
func (t Test[T1, T2]) Print[T3 any](c T3) {} // invalid

Generic Functions

A generic function takes extra type parameters, which can be referenced in the function's parameter list and body.

type Number interface {
	int | int8 | int16 | int32 | int64 | uint | uint8 | uint16 | uint32 | uint64
}

// The generic function Min returns the smaller of two values of any integer type
func Min[T Number](a, b T) T {
	if a < b {
		return a
	}
	return b
}

func main() {
	var a int = 30
	var b int = 40
	c := Min[int](a, b)
	var d uint32 = 30
	var e uint32 = 70
	f := Min[uint32](d, e)
	_, _ = c, f
}

Type Inference

As a modern language, Golang supports type inference: when calling a generic function, the concrete type arguments do not always need to be passed; the compiler infers them from the ordinary arguments.

type Vector[T any] []T

func Copy[T any](from T) T { return from }

// When calling the generic function Copy, the concrete type argument can be omitted:
// the compiler infers it from the ordinary argument
// (composite literals like Vector[int64]{...} still name their type argument explicitly)
func main() {
	a := Vector[int64]{1}  // a has type Vector[int64]
	b := Vector[uint32]{2} // b has type Vector[uint32]
	c := Copy(a)           // T is inferred to be Vector[int64]
	d := Copy(b)           // T is inferred to be Vector[uint32]
	_, _ = c, d
}