📝 finish translation of README.md

ValKmjolnir 2022-06-18 18:48:00 +08:00
parent abbdb22478
commit 9114ecd820
1 changed files with 105 additions and 188 deletions

@ -52,21 +52,21 @@
* [v8.0](#version-80-vm-last-update-2022212) * [v8.0](#version-80-vm-last-update-2022212)
* [v9.0](#version-90-vm-last-update-2022518) * [v9.0](#version-90-vm-last-update-2022518)
* [v10.0](#version-100-vm-latest) * [v10.0](#version-100-vm-latest)
* [__测试数据__](#benchmark) * [__测试数据__](#测试数据)
* [v6.5 (i5-8250U windows 10)](#version-65-i5-8250u-windows10-2021619) * [v6.5 (i5-8250U windows 10)](#version-65-i5-8250u-windows10-2021619)
* [v6.5 (i5-8250U ubuntu-WSL)](#version-70-i5-8250u-ubuntu-wsl-on-windows10-2021629) * [v6.5 (i5-8250U ubuntu-WSL)](#version-70-i5-8250u-ubuntu-wsl-on-windows10-2021629)
* [v8.0 (R9-5900HX ubuntu-WSL)](#version-80-r9-5900hx-ubuntu-wsl-2022123) * [v8.0 (R9-5900HX ubuntu-WSL)](#version-80-r9-5900hx-ubuntu-wsl-2022123)
* [v9.0 (R9-5900HX ubuntu-WSL)](#version-90-r9-5900hx-ubuntu-wsl-2022213) * [v9.0 (R9-5900HX ubuntu-WSL)](#version-90-r9-5900hx-ubuntu-wsl-2022213)
* [__特性__](#difference-between-andys-and-this-interpreter) * [__特殊之处__](#与andy解释器的不同之处)
* [严格的定义要求](#1-must-use-var-to-define-variables) * [严格的定义要求](#1-必须用var定义变量)
* [(已过时)在定义后调用变量](#2-now-supported-couldnt-use-variables-before-definitions) * [(已过时)在定义后调用变量](#2-现在已经支持-不能在定义前使用变量)
* [缺省(默认)参数](#3-default-dynamic-arguments-not-supported) * [默认不定长参数](#3-默认不定长参数)
* [__堆栈追踪信息__](#trace-back-info) * [__堆栈追踪信息__](#trace-back-info)
* [内置函数 'die'](#1-native-function-die) * [内置函数`die`](#1-内置函数die)
* [栈溢出](#2-stack-overflow-crash-info) * [栈溢出](#2-栈溢出信息)
* [运行时错误](#3-normal-vm-error-crash-info) * [运行时错误](#3-运行时错误)
* [详细的崩溃信息](#4-detailed-crash-info) * [详细的崩溃信息](#4-详细的崩溃信息)
* [__调试器__](#debugger) * [__调试器__](#调试器)
__如果有好的意见或建议欢迎联系我们!__ __如果有好的意见或建议欢迎联系我们!__
@ -950,27 +950,19 @@ for(var i=0;i<4000000;i+=1);
2021/5/31 update: 2021/5/31 update:
Now gc can collect garbage correctly without re-collecting, 现在垃圾收集器不会错误地重复收集未使用变量了。
which will cause fatal error.
Add `builtin_alloc` to avoid mark-sweep when running a built-in function, 添加了`builtin_alloc`以防止在运行内置函数的时候错误触发标记清除。
which will mark useful items as useless garbage to collect.
Better use setsize and assignment to get a big array, 建议在获取大空间数组的时候尽量使用setsize因为`append`在被频繁调用时可能会频繁触发垃圾收集器。
`append` is very slow in this situation.
2021/6/3 update: 2021/6/3 update:
Fixed a bug that gc still re-collects garbage, 修复了垃圾收集器还是他妈的会重复收集的bug这次我设计了三个标记状态来保证垃圾是被正确收集了。
this time i use three mark states to make sure garbage is ready to be collected.
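To illustrate why a third mark state helps here, the following is a minimal sketch, not the actual `nasal_gc` code. The state names (`uncollected`, `found`, `collected`), the `Obj` and `Heap` types, and the free-list layout are all assumptions made for this example; the changelog only says that three states are used. An object that is already on the free list is tagged `collected`, so a later sweep skips it instead of freeing it a second time.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical three-state mark, assumed for illustration only.
enum class Mark { uncollected, found, collected };

struct Obj {
    Mark mark = Mark::collected;  // every object starts on the free list
    std::vector<Obj*> refs;       // outgoing references
};

struct Heap {
    std::vector<Obj> pool;        // fixed-size pool, never resized
    std::vector<Obj*> free_list;

    explicit Heap(std::size_t n) : pool(n) {
        for (auto& o : pool) free_list.push_back(&o);
    }

    // Hand out an object; returns nullptr when the pool is exhausted.
    Obj* alloc() {
        if (free_list.empty()) return nullptr;
        Obj* o = free_list.back();
        free_list.pop_back();
        o->mark = Mark::uncollected;
        o->refs.clear();
        return o;
    }

    // Mark phase: everything reachable from the roots becomes `found`.
    void mark(const std::vector<Obj*>& roots) {
        std::vector<Obj*> work(roots);
        while (!work.empty()) {
            Obj* o = work.back();
            work.pop_back();
            if (o->mark != Mark::uncollected) continue;  // visited or free
            o->mark = Mark::found;
            for (Obj* r : o->refs) work.push_back(r);
        }
    }

    // Sweep phase: only `uncollected` objects are freed. Objects that are
    // already `collected` are skipped, so nothing is freed twice.
    void sweep() {
        for (auto& o : pool) {
            if (o.mark == Mark::uncollected) {
                o.mark = Mark::collected;
                free_list.push_back(&o);
            } else if (o.mark == Mark::found) {
                o.mark = Mark::uncollected;  // reset for the next cycle
            }
        }
    }
};
```

With only two states (marked and unmarked), an object freed in one cycle would still look unmarked in the next cycle and could be pushed onto the free list again, which is the kind of re-collection this changelog entry describes.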
Change `callf` to `callfv` and `callfh`. 将`callf`指令拆分为`callfv`和`callfh`。并且`callfv`将直接从`val_stack`获取传参,而不是先通过一个`vm_vec`把参数收集起来再传入,后者是非常低效的做法。
And `callfv` fetches arguments from `val_stack` directly instead of using `vm_vec`,
a not very efficient way.
Better use `callfv` instead of `callfh`, 建议更多使用`callfv`而不是`callfh`,因为`callfh`只能从栈上获取参数并整合为`vm_hash`之后才能传给该指令进行处理,拖慢执行速度。
`callfh` will fetch a `vm_hash` from stack and parse it,
making this process slow.
```javascript ```javascript
var f=func(x,y){return x+y;} var f=func(x,y){return x+y;}
@ -1002,9 +994,9 @@ f(1024,2048);
0x00000011: nop 0x00000000 0x00000011: nop 0x00000000
``` ```
2021/6/21 update: Now gc will not collect nullptr. 2021/6/21 update:
And the function of assignment is complete,
now these kinds of assignment is allowed: 现在垃圾收集器不会收集空指针了。并且调用链中含有函数调用的赋值语句现在也可以执行了,下面这些赋值方式是合法的:
```javascript ```javascript
var f=func() var f=func()
@ -1021,15 +1013,9 @@ m(0)._=m(1)._=10;
[0,1,2][1:2][0]=0; [0,1,2][1:2][0]=0;
``` ```
In the old version, 在老版本中,语法分析器会检查左值,并且在检测到有特别调用的情况下直接告知用户这种左值是不被接受的(bad lvalue)。但是现在它可以正常运作了。为了保证这种赋值语句能正常执行codegen模块会优先使用`nasal_codegen::call_gen()`生成前面调用链的字节码而不是全部使用 `nasal_codegen::mcall_gen()`,在最后一个调用处才会使用`nasal_codegen::mcall_gen()`。
parser will check this left-value and tells that these kinds of left-value are not allowed(bad lvalue).
But now it can work. 所以现在生成的相关字节码也完全不同了:
And you could see its use by reading the code above.
To make sure this assignment works correctly,
codegen will generate byte code by `nasal_codegen::call_gen()` instead of `nasal_codegen::mcall_gen()`,
and the last child of the ast will be generated by `nasal_codegen::mcall_gen()`.
So the bytecode is totally different now:
```x86asm ```x86asm
.number 10 .number 10
@ -1092,42 +1078,27 @@ So the bytecode is totally different now:
0x00000035: nop 0x00000000 0x00000035: nop 0x00000000
``` ```
As you could see from the bytecode above, 从上面这些字节码可以看出,`mcall`/`mcallv`/`mcallh`指令的使用频率比以前减小了一些,而`call`/`callv`/`callh`/`callfv`/`callfh`则相反。
`mcall`/`mcallv`/`mcallh` operands' using frequency will reduce,
`call`/`callv`/`callh`/`callfv`/`callfh` at the opposite.
And because of the new structure of `mcall`, 并且因为新的数据结构,`mcall`指令以及`addr_stack`,一个曾用来存储指针的栈,从`nasal_vm`中被移除。现在`nasal_vm`使用`nasal_val** mem_addr`来暂存获取的内存地址。这不会导致严重的问题,因为内存空间是 __获取即使用__ 的。
`addr_stack`, a stack used to store the memory address,
is deleted from `nasal_vm`,
and now `nasal_vm` use `nasal_val** mem_addr` to store the memory address.
This will not cause fatal errors because the memory address is used __immediately__ after getting it.
### version 7.0 vm (last update 2021/10/8) ### version 7.0 vm (last update 2021/10/8)
2021/6/26 update: 2021/6/26 update:
Instruction dispatch is changed from call-threading to computed-goto(with inline function). 指令分派方式从call-threading改为了computed-goto。在更改了指令分派方式之后nasal_vm的执行效率有了非常巨大的提升。现在虚拟机可以在0.2秒内执行完test/bigloop和test/pi并且在linux平台虚拟机可以在0.8秒内执行完test/fib。你可以在下面的测试数据部分看到测试的结果。
After changing the way of instruction dispatch,
there is a great improvement in nasal_vm.
Now vm can run test/bigloop and test/pi in 0.2s!
And vm runs test/fib in 0.8s on linux.
You could see the time use data below,
in Test data section.
This version uses g++ extension "labels as values", 这个分派方式使用了g++扩展"labels as values"clang++目前也支持这种指令分派的实现方式。(不过MSVC支不支持就不得而知了哈哈)
which is also supported by clang++.
(But i don't know if MSVC supports this)
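For readers unfamiliar with computed-goto dispatch, here is a minimal sketch of the technique using the GNU "labels as values" extension (`&&label` and `goto *ptr`), which g++ and clang++ accept. The three-instruction opcode set, the `Instr` struct and the `run` function are hypothetical and far simpler than `nasal_vm`; they only show the shape of the dispatch.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical three-instruction machine, for illustration only.
enum Op : std::uint8_t { op_pnum, op_add, op_ret };

struct Instr {
    Op op;
    double num;  // immediate operand, used by op_pnum only
};

// Runs the program and returns the value on top of the stack at op_ret.
// Requires the GNU "labels as values" extension (g++ / clang++).
double run(const std::vector<Instr>& code) {
    // One label address per opcode, indexed by the opcode value.
    static void* table[] = { &&do_pnum, &&do_add, &&do_ret };
    std::vector<double> stack;
    const Instr* pc = code.data();
    double rhs = 0.0;

    goto *table[pc->op];  // dispatch the first instruction

do_pnum:
    stack.push_back(pc->num);
    ++pc;
    goto *table[pc->op];  // each handler jumps straight to the next one

do_add:
    rhs = stack.back();
    stack.pop_back();
    stack.back() += rhs;
    ++pc;
    goto *table[pc->op];

do_ret:
    return stack.empty() ? 0.0 : stack.back();
}
```

Compared with call-threading, where every instruction is an indirect function call that returns to a central loop, each handler here ends with its own indirect jump, so there is no call/return overhead per instruction.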
There is also a change in nasal_gc: nasal_gc中也有部分改动:
`std::vector` global is deleted, 全局变量不再用`std::vector`存储,而是全部存在操作数栈上(从`val_stack+0`到`val_stack+intg-1`)。
now the global values are all stored on stack(from `val_stack+0` to `val_stack+intg-1`).
2021/6/29 update: 2021/6/29 update:
Add some instructions that execute const values: 添加了一些直接用常量进行运算的指令:
`op_addc`,`op_subc`,`op_mulc`,`op_divc`,`op_lnkc`,`op_addeqc`,`op_subeqc`,`op_muleqc`,`op_diveqc`,`op_lnkeqc`. `op_addc`,`op_subc`,`op_mulc`,`op_divc`,`op_lnkc`,`op_addeqc`,`op_subeqc`,`op_muleqc`,`op_diveqc`,`op_lnkeqc`
Now the bytecode of test/bigloop.nas seems like this: 现在test/bigloop.nas的字节码是这样的:
```x86asm ```x86asm
.number 4e+006 .number 4e+006
@ -1146,12 +1117,9 @@ Now the bytecode of test/bigloop.nas seems like this:
0x0000000b: nop 0x00000000 0x0000000b: nop 0x00000000
``` ```
And this test file runs in 0.1s after this update. 在这次更新之后这个测试文件可以在0.1秒内运行结束。大多数的运算操作速度都有提升。
Most of the calculations are accelerated.
Also, assignment bytecode has changed a lot. 并且赋值相关的字节码也有一些改动。现在赋值语句只包含一个标识符时,会优先调用`op_load`来赋值,而不是使用`op_meq`和`op_pop`。
Now the first identifier that called in assignment will use `op_load` to assign,
instead of `op_meq`,`op_pop`.
```javascript ```javascript
var (a,b)=(1,2); var (a,b)=(1,2);
@ -1176,20 +1144,15 @@ a=b=0;
2021/10/8 update: 2021/10/8 update:
In this version vm_nil and vm_num now is not managed by `nasal_gc`, 从这个版本开始`vm_nil`和`vm_num`不再由`nasal_gc`管理,这会大幅度降低`gc::alloc`的调用并且会大幅度提升执行效率。
this will decrease the usage of `gc::alloc` and increase the efficiency of execution.
New value type is added: `vm_obj`. 添加了新的数据类型: `vm_obj`。这个类型是留给用户定义他们想要的数据类型的。相关的API会在未来加入。
This type is reserved for user to define their own value types.
Related API will be added in the future.
Fully functional closure: 功能完备的闭包:添加了读写闭包数据的指令。删除了老的指令`op_offset`。
Add new operands that get and set upvalues.
Delete an old operand `op_offset`.
2021/10/13 update: 2021/10/13 update:
The format of output information of bytecodes changes to this: 字节码信息输出格式修改为如下形式:
```x86asm ```x86asm
0x000002e6: newf 0x2ea 0x000002e6: newf 0x2ea
@ -1234,30 +1197,26 @@ The format of output information of bytecodes changes to this:
2022/1/22 update: 2022/1/22 update:
Delete `op_pone` and `op_pzero`. 删除`op_pone`和`op_pzero`。这两个指令在目前已经没有实际意义,并且已经被`op_pnum`替代。
Both of them are meaningless and will be replaced by `op_pnum`.
### version 9.0 vm (last update 2022/5/18) ### version 9.0 vm (last update 2022/5/18)
2022/2/12 update: 2022/2/12 update:
Local values now are __stored on stack__. 局部变量现在也被 __存储在栈上__
So function calling will be faster than before. 所以函数调用比以前也会快速很多。
Because in v8.0 when calling a function, 在v8.0如果你想调用一个函数,
new `vm_vec` will be allocated by `nasal_gc`, this makes gc doing mark-sweep too many times and spends a quite lot of time. 新的`vm_vec`将被分配出来用于模拟局部作用域,这个操作会导致标记清除过程会被频繁触发并且浪费太多的执行时间。
In test file `test/bf.nas`, it takes too much time to test the file because this file has too many function calls(see test data below in table `version 8.0 (R9-5900HX ubuntu-WSL 2022/1/23)`). 在测试文件`test/bf.nas`中,这种调用方式使得大部分时间都被浪费了,因为这个测试文件包含大量且频繁的函数调用(详细数据请看测试数据一节中`version 8.0 (R9-5900HX ubuntu-WSL 2022/1/23)`)。
Upvalue now is generated when creating first new function in the local scope, using `vm_vec`. 现在闭包会在第一次在局部作用域创建新函数的时候产生,使用`vm_vec`。
And after that when creating new functions, they share the same upvalue, and the upvalue will synchronize with the local scope each time creating a new function. 在那之后如果再创建新的函数,则他们会共享同一个闭包,这些闭包会在每次于局部作用域创建新函数时同步。
2022/3/27 update: 2022/3/27 update:
In this month's updates we change upvalue from `vm_vec` to `vm_upval`, 在这个月的更新中我们把闭包的数据结构从`vm_vec`换成了一个新的对象`vm_upval`,这种类型有着和另外一款编程语言 __`Lua`__ 中闭包相类似的结构。
a special gc-managed object,
which has almost the same structure of that upvalue object in another programming language __`Lua`__.
Today we change the output format of bytecode. 同时我们也修改了字节码的输出格式。新的格式看起来像是 `objdump`:
New output format looks like `objdump`:
```x86asm ```x86asm
0x0000029b: 0a 00 00 00 00 newh 0x0000029b: 0a 00 00 00 00 newh
@ -1338,7 +1297,7 @@ func <0x2c4>:
2022/5/19 update: 2022/5/19 update:
Now we add coroutine in this runtime: 在这个版本中我们给nasal加入了协程:
```javascript ```javascript
var coroutine={ var coroutine={
@ -1350,28 +1309,24 @@ var coroutine={
}; };
``` ```
`coroutine.create` is used to create a new coroutine object using a function. `coroutine.create`用于创建新的协程对象。不过创建之后协程并不会直接运行。
But this coroutine will not run immediately.
`coroutine.resume` is used to continue running a coroutine. `coroutine.resume`用于继续运行一个协程。
`coroutine.yield` is used to interrupt the running of a coroutine and throw some values. `coroutine.yield`用于中断一个协程的运行过程并且抛出一些数据。这些数据会被`coroutine.resume`接收并返回。而在协程函数中`coroutine.yield`本身只返回`vm_nil`。
These values will be accepted and returned by `coroutine.resume`.
And `coroutine.yield` it self returns `vm_nil` in the coroutine function.
`coroutine.status` is used to see the status of a coroutine. `coroutine.status`用于查看协程的状态。协程有三种不同的状态:`suspended`挂起,`running`运行中,`dead`结束运行。
There are 3 types of status:`suspended` means waiting for running,`running` means is running,`dead` means finished running.
`coroutine.running` is used to judge if there is a coroutine running now. `coroutine.running`用于判断当前是否有协程正在运行。
__CAUTION:__ coroutine should not be created or running inside another coroutine. __注意:__ 协程不能在其他正在运行的协程中创建。
__We will explain how resume and yield work here:__ __接下来我们解释这个协程的运行原理:__
When `op_callb` is called, the stack frame is like this: 当`op_callb`被执行时,栈帧如下所示:
```C++ ```C++
+----------------------------+(main stack) +----------------------------+(主操作数栈)
| old pc(vm_ret) | <- top[0] | old pc(vm_ret) | <- top[0]
+----------------------------+ +----------------------------+
| old localr(vm_addr) | <- top[-1] | old localr(vm_addr) | <- top[-1]
@ -1385,10 +1340,10 @@ When `op_callb` is called, the stack frame is like this:
+----------------------------+ +----------------------------+
``` ```
In `op_callb`'s progress, next step the stack frame is: 在`op_callb`执行过程中,下一步的栈帧如下:
```C++ ```C++
+----------------------------+(main stack) +----------------------------+(主操作数栈)
| nil(vm_nil) | <- push nil | nil(vm_nil) | <- push nil
+----------------------------+ +----------------------------+
| old pc(vm_ret) | | old pc(vm_ret) |
@ -1404,25 +1359,20 @@ In `op_callb`'s progress, next step the stack frame is:
+----------------------------+ +----------------------------+
``` ```
Then we call `resume`, this function will change stack. 接着我们调用`resume`,这个函数会替换操作数栈。我们会看到,协程的操作数栈上已经保存了一些数据,但是我们首次进入协程执行时,这个操作数栈的栈顶将会是`vm_ret`,并且返回的`pc`值是`0`。
As we can see, coroutine stack already has some values on it,
but if we first enter it, the stack top will be `vm_ret`, and the return `pc` is `0`.
So for safe running, `resume` will return `gc.top[0]`. 为了保证栈顶的数据不会被破坏,`resume`会返回`gc.top[0]`。`op_callb`将会执行`top[0]=resume()`,所以栈顶的数据虽然被覆盖了一次,但是实际上还是原来的数据。
`op_callb` will do `top[0]=resume()`, so the value does not change.
```C++ ```C++
+----------------------------+(coroutine stack) +----------------------------+(协程操作数栈)
| pc:0(vm_ret) | <- now gc.top[0] | pc:0(vm_ret) | <- now gc.top[0]
+----------------------------+ +----------------------------+
``` ```
When we call `yield`, the function will do like this. 当我们调用`yield`的时候,该函数会执行出这个情况,我们发现`op_callb` 已经把`nil`放在的栈顶。但是应该返回的`local[1]`到底发送到哪里去了?
And we find that `op_callb` has put the `nil` at the top.
but where is the returned `local[1]` sent?
```C++ ```C++
+----------------------------+(coroutine stack) +----------------------------+(协程操作数栈)
| nil(vm_nil) | <- push nil | nil(vm_nil) | <- push nil
+----------------------------+ +----------------------------+
| old pc(vm_ret) | | old pc(vm_ret) |
@ -1438,11 +1388,10 @@ but where is the returned `local[1]` sent?
+----------------------------+ +----------------------------+
``` ```
When `builtin_coyield` is finished, the stack is set to main stack, 当`builtin_coyield`执行完毕之后,栈又切换到了主操作数栈上,这时可以看到返回的`local[1]`实际上被`op_callb`放在了这里的栈顶:
and the returned `local[1]` in fact is set to the top of the main stack by `op_callb`:
```C++ ```C++
+----------------------------+(main stack) +----------------------------+(主操作数栈)
| return_value(nasal_ref) | | return_value(nasal_ref) |
+----------------------------+ +----------------------------+
| old pc(vm_ret) | | old pc(vm_ret) |
@ -1458,22 +1407,20 @@ and the returned `local[1]` in fact is set to the top of the main stack by `op_c
+----------------------------+ +----------------------------+
``` ```
so the main progress feels the value on the top is the returned value of `resume`. 所以主程序会认为顶部这个返回值好像是`resume`返回的。而实际上`resume`的返回值在协程的操作数栈顶。综上所述:
but in fact the `resume`'s returned value is set on coroutine stack.
so we conclude this:
```C++ ```C++
resume (main->coroutine) return coroutine.top[0]. coroutine.top[0] = coroutine.top[0]; resume (main->coroutine) return coroutine.top[0]. coroutine.top[0] = coroutine.top[0];
yield (coroutine->main) return a vector. main.top[0] = vector; yield (coroutine->main) return a vector. main.top[0] = vector;
``` ```
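The summary above can be modelled in a few lines of C++. This is a simplified sketch under stated assumptions, not the real `nasal_vm` code: `Value`, `Context`, `Vm` and `callb_store` are hypothetical names, values are plain strings, and the stack frames pushed around the call are left out. It only shows which stack ends up receiving the result that `op_callb` writes to `top[0]`.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical value and context types, for illustration only.
using Value = std::string;

struct Context {
    std::vector<Value> stack;  // operand stack; back() plays the role of top[0]
};

struct Vm {
    Context main_ctx;          // main operand stack
    Context co_ctx;            // coroutine operand stack
    bool in_coroutine = false;

    Context& active() { return in_coroutine ? co_ctx : main_ctx; }

    // main -> coroutine: switch stacks and hand back the coroutine's own
    // top, so the caller's `top[0] = resume()` leaves that slot unchanged.
    Value resume() {
        in_coroutine = true;
        return co_ctx.stack.back();
    }

    // coroutine -> main: switch stacks and hand back the yielded values,
    // which the caller then stores into the main stack's top slot.
    Value yield(Value yielded) {
        in_coroutine = false;
        return yielded;
    }

    // Models the tail of op_callb: write the builtin's result to top[0]
    // of whichever stack is active after the builtin returns.
    void callb_store(Value result) {
        active().stack.back() = std::move(result);
    }
};
```

Because the store happens after the builtin has already switched stacks, the value returned by `resume` lands on the coroutine stack, while the value returned by `yield` lands on the main stack, where the main program sees it as the result of `resume`.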
## Benchmark ## 测试数据
![benchmark](../pic/benchmark.png) ![benchmark](../pic/benchmark.png)
### version 6.5 (i5-8250U windows10 2021/6/19) ### version 6.5 (i5-8250U windows10 2021/6/19)
running time and gc time: 执行时间以及垃圾收集器占用时间:
|file|call gc|total time|gc time| |file|call gc|total time|gc time|
|:----|:----|:----|:----| |:----|:----|:----|:----|
@ -1488,7 +1435,7 @@ running time and gc time:
|quick_sort.nas|2768|0.107s|0s| |quick_sort.nas|2768|0.107s|0s|
|bfs.nas|2471|1.763s|0.003s| |bfs.nas|2471|1.763s|0.003s|
operands calling frequency: 指令调用频率:
|file|1st|2nd|3rd|4th|5th| |file|1st|2nd|3rd|4th|5th|
|:----|:----|:----|:----|:----|:----| |:----|:----|:----|:----|:----|:----|
@ -1503,7 +1450,7 @@ operands calling frequency:
|quick_sort.nas|calll|pop|jt|jf|less| |quick_sort.nas|calll|pop|jt|jf|less|
|bfs.nas|calll|pop|callv|mcalll|jf| |bfs.nas|calll|pop|callv|mcalll|jf|
operands calling total times: 指令总调用数:
|file|1st|2nd|3rd|4th|5th| |file|1st|2nd|3rd|4th|5th|
|:----|:----|:----|:----|:----|:----| |:----|:----|:----|:----|:----|:----|
@ -1520,7 +1467,7 @@ operands calling total times:
### version 7.0 (i5-8250U ubuntu-WSL on windows10 2021/6/29) ### version 7.0 (i5-8250U ubuntu-WSL on windows10 2021/6/29)
running time: 执行时间:
|file|total time|info| |file|total time|info|
|:----|:----|:----| |:----|:----|:----|
@ -1537,7 +1484,7 @@ running time:
### version 8.0 (R9-5900HX ubuntu-WSL 2022/1/23) ### version 8.0 (R9-5900HX ubuntu-WSL 2022/1/23)
running time: 执行时间:
|file|total time|info| |file|total time|info|
|:----|:----|:----| |:----|:----|:----|
@ -1556,7 +1503,7 @@ running time:
### version 9.0 (R9-5900HX ubuntu-WSL 2022/2/13) ### version 9.0 (R9-5900HX ubuntu-WSL 2022/2/13)
running time: 执行时间:
|file|total time|info| |file|total time|info|
|:----|:----|:----| |:----|:----|:----|
@ -1573,20 +1520,19 @@ running time:
|mandelbrot.nas|0.0156s|| |mandelbrot.nas|0.0156s||
|ascii-art.nas|0s|| |ascii-art.nas|0s||
`bf.nas` is a very interesting test file that there is a brainfuck interpreter written in nasal. `bf.nas`是个非常有意思的测试文件我们用nasal在这个文件里实现了一个brainfuck解释器并且用这个解释器绘制了一个曼德勃罗集合。
And we use this bf interpreter to draw a mandelbrot set.
In 2022/2/17 update we added `\e` into the lexer. And the `bfcolored.nas` uses this special ASCII code. Here is the result: 在2022/2/17更新中我们给词法分析器添加了对`\e`的识别逻辑。这样 `bfcolored.nas`可以使用特别的ASCII操作字符来绘制彩色的曼德勃罗集合:
![mandelbrot](../pic/mandelbrot.png) ![mandelbrot](../pic/mandelbrot.png)
## __Difference Between Andy's and This Interpreter__ ## __与andy解释器的不同之处__
### 1. must use `var` to define variables ### 1. 必须用`var`定义变量
This interpreter uses more strict syntax to make sure it is easier for you to program and debug. 这个解释器使用了更加严格的语法检查来保证你可以更轻松地debug。这是非常有必要的严格否则debug会非常痛苦。
In Andy's interpreter: 在Andy的解释器中:
```javascript ```javascript
import("lib.nas"); import("lib.nas");
@ -1594,17 +1540,9 @@ foreach(i;[0,1,2,3])
print(i) print(i)
``` ```
This program can run normally with output 0 1 2 3. 这个程序会正常输出`0 1 2 3`。然而这个`i`标识符实际上在这里是被第一次定义,而且没有使用`var`。我认为这样的设计很容易让使用者迷惑。很有可能很多使用者都没有发现这里实际上是第一次定义`i`的地方。没有使用`var`的定义会让程序员认为这个`i`也许是在别的地方定义的。
But take a look at the iterator 'i',
this symbol is defined in foreach without using keyword 'var'.
I think this design will make programmers filling confused.
This is ambiguous that programmers maybe difficult to find the 'i' is defined here.
Without 'var', programmers may think this 'i' is defined anywhere else.
So in this new interpreter i use a more strict syntax to force users to use 'var' to define iterator of forindex and foreach. 所以在这个新的解释器中,我直接使用严格的语法检查方法来强行要求用户必须要使用`var`来定义新的变量或者迭代器。如果你忘了加这个关键字,并且你没有在别的地方声明过这个变量,那么你就会得到这个:
If you forget to add the keyword 'var',
and you haven't defined this symbol before,
you will get this:
```javascript ```javascript
[code] test.nas:2 undefined symbol "i". [code] test.nas:2 undefined symbol "i".
@ -1613,10 +1551,9 @@ foreach(i;[0,1,2,3])
print(i) print(i)
``` ```
### 2. (now supported) couldn't use variables before definitions ### 2. (现在已经支持) 不能在定义前使用变量
(__Outdated__: this is now supported) Also there's another difference. (__过时信息__: 现在已经支持)
In Andy's interpreter:
```javascript ```javascript
var a=func {print(b);} var a=func {print(b);}
@ -1624,48 +1561,30 @@ var b=1;
a(); a();
``` ```
This program runs normally with output 1. 这个程序在andy的解释器中可以正常运行并输出内容。然而在这个新的解释器中你会得到:
But in this new interpreter, it will get:
```javascript ```javascript
[code] test.nas:1 undefined symbol "b". [code] test.nas:1 undefined symbol "b".
var a=func {print(b);} var a=func {print(b);}
``` ```
This difference is caused by different kinds of ways of lexical analysis. 这个差异主要是文法作用域分析带来的。在大多数的脚本语言解释器中,他们使用动态的分析方式来检测符号是不是已经定义过了。然而,这种分析方法的代价就是执行效率不会很高。为了保证这个解释器能以极高的速度运行,我使用的是静态的分析方式,用静态语言类似的管理方式来管理每个符号对应的内存空间。这样虚拟机就不需要在运行的时候频繁检查符号是否存在。但是这也带来了差异。在这里你只会得到`undefined symbol`,而不是大多数脚本语言解释器中那样可以正常执行。
In most script language interpreters,
they use dynamic analysis to check if this symbol is defined yet.
However, this kind of analysis is at the cost of lower efficiency.
To make sure the interpreter runs at higher efficiency,
i choose static analysis to manage the memory space of each symbol.
By this way, runtime will never need to check if a symbol exists or not.
But this causes a difference.
You will get an error of 'undefined symbol',
instead of nothing happening in most script language interpreters.
This change is __controversial__ among FGPRC's members. 这个差异在FGPRC成员中有 __争议__。所以在未来我可能还是会用动态的分析方法来迎合大多数的用户。
So maybe in the future i will use dynamic analysis again to cater to the habits of senior programmers.
(2021/8/3 update) __Now i use scanning ast twice to reload symbols. (2021/8/3 update) __现在我使用二次搜索抽象语法树的方式来检测符号是否会被定义所以在这次更新之后这个差异不复存在。__ 不过如果你直接获取一个还未被定义的变量的内容的话,你会得到一个空数据,而不是`undefined error`。
So this difference does not exist from this update.__
But a new difference is that if you call a variable before defining it,
you'll get nil instead of 'undefined error'.
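The idea of scanning the AST twice can be sketched as follows. This is a minimal illustration, assuming a flattened list of nodes instead of a real tree and ignoring scopes; `Node`, `collect_definitions` and `undefined_symbols` are hypothetical names, not functions from the interpreter. The first pass records every defined name, so a use that appears textually before its definition is no longer reported in the second pass.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical flattened "AST": each node either defines or uses a name.
struct Node {
    enum Kind { Define, Use } kind;
    std::string name;
};

// Pass 1: record every name introduced anywhere in the program.
std::unordered_set<std::string> collect_definitions(const std::vector<Node>& ast) {
    std::unordered_set<std::string> defined;
    for (const Node& n : ast)
        if (n.kind == Node::Define) defined.insert(n.name);
    return defined;
}

// Pass 2: report uses of names that no definition ever introduces.
std::vector<std::string> undefined_symbols(const std::vector<Node>& ast) {
    const auto defined = collect_definitions(ast);
    std::vector<std::string> errors;
    for (const Node& n : ast)
        if (n.kind == Node::Use && defined.count(n.name) == 0)
            errors.push_back(n.name);
    return errors;
}
```

For a program shaped like the example above (use `b` inside a function, then `var b=1;`), the list of undefined symbols is empty, whereas a single left-to-right pass would have flagged `b` at its first use.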
### 3. default dynamic arguments not supported ### 3. 默认不定长参数
In this new interpreter, 这个解释器在运行时,函数不会将超出参数表的那部分不定长参数放到默认的`arg`中。所以你如果不定义`arg`就使用它,那你只会得到`undefined symbol`。
function doesn't put dynamic arguments into vector `arg` automatically.
So if you use `arg` without definition,
you'll get an error of `undefined symbol`.
## __Trace Back Info__ ## __堆栈追踪信息__
When the interpreter crashes, 当解释器崩溃时,它会反馈错误产生过程的堆栈追踪信息:
it will print trace back information:
### 1. native function `die` ### 1. 内置函数`die`
Function `die` is used to throw error and crash immediately. `die`函数用于直接抛出错误并终止执行。
```javascript ```javascript
func() func()
@ -1697,9 +1616,9 @@ vm stack(0x7fffcd21bc68<sp+80>, limit 10, total 12):
0x00000052 | nil | 0x00000052 | nil |
``` ```
### 2. stack overflow crash info ### 2. 栈溢出信息
Here is an example of stack overflow: 这是一个会导致栈溢出的例子:
```javascript ```javascript
func(f){ func(f){
@ -1731,9 +1650,9 @@ vm stack(0x7fffd3781d58<sp+80>, limit 10, total 8108):
0x00001ff2 | addr | 0x7fffd37a16e8 0x00001ff2 | addr | 0x7fffd37a16e8
``` ```
### 3. normal vm error crash info ### 3. 运行时错误
Error will be thrown if there's a fatal error when executing: 如果在执行的时候出现错误,程序会直接终止执行:
```javascript ```javascript
func(){ func(){
@ -1749,9 +1668,9 @@ vm stack(0x7fffff539c28<sp+80>, limit 10, total 1):
0x00000050 | num | 0 0x00000050 | num | 0
``` ```
### 4. detailed crash info ### 4. 详细的崩溃信息
Use command __`-d`__ or __`--detail`__ the trace back info will show more details: 使用命令 __`-d`__ 或 __`--detail`__ 后trace back信息会包含更多的细节内容:
```javascript ```javascript
hello hello
@ -1867,13 +1786,11 @@ local(0x7ffff42f3d68<sp+86>):
0x00000001 | str | <0x1932480> error occurred t... 0x00000001 | str | <0x1932480> error occurred t...
``` ```
## __Debugger__ ## __调试器__
In nasal v8.0 we added a debugger. 在v8.0版本中我们为nasal添加了调试器。现在我们可以在测试程序的时候同时看到源代码和生成的字节码并且单步执行。
Now we could see both source code and bytecode when testing program.
Use command `./nasal -dbg xxx.nas` to use the debugger, 使用这个命令`./nasal -dbg xxx.nas`来启用调试器,接下来调试器会打开文件并输出以下内容:
and the debugger will print this:
```javascript ```javascript
[debug] nasal debug mode [debug] nasal debug mode
@ -1901,7 +1818,7 @@ vm stack(0x7fffe05e3190<sp+79>, limit 5, total 0)
>> >>
``` ```
If want help, input `h` to get help. 如果需要查看命令的使用方法,可以输入`h`获取帮助信息。
```bash ```bash
<option> <option>
@ -1920,8 +1837,8 @@ If want help, input `h` to get help.
bk, break | set break point bk, break | set break point
``` ```
When running the debugger, you could see what is on stack. 当运行调试器的时候,你可以看到现在的操作数栈上到底有些什么数据。
This will help you debugging or learning how the vm works: 这些信息可以帮助你调试,同时也可以帮助你理解这个虚拟机是如何工作的:
```javascript ```javascript
source code: source code: