A Byte Machine for Forth (and Not Only), Native American Style (Part 3)

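2019 has arrived, and the New Year holidays are coming to an end. Time to start remembering bytes, commands, variables, loops...

I have forgotten all of it over the holidays. Let's remember it together!

Today we will build an interpreter for our byte machine. This is the third article; the earlier ones are part one and part two.

Happy New Year, everyone, and welcome, read on!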

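First, I'll answer the questions from fpauk. The questions are entirely valid. Right now the architecture of this byte machine lets us work with direct processor addresses. But these addresses are not in the bytecode; they are formed after the system starts. Once the system is running we can create any pointers, and this code will work correctly on any platform. For example, the address of a variable or an array can be obtained with the var0 command. This command works on any platform and returns the correct address for that platform. You can then use the address however you like.

Still, fpauk is right: an address must not be stored in bytecode. So we can write platform-independent code, but it takes some effort. In particular, we have to make sure that addresses do not end up in the bytecode. And they can end up there, for example, if compiled code is saved to a file. The file will contain data, and that data may include addresses: for example, the values of the variables here, context and others.

To get rid of this problem, addresses have to be virtualized. x86 addressing is quite powerful, and in most cases virtualization would not even add extra instructions. Nevertheless, I will continue with the current architecture and absolute addresses. Later, when we get to testing, we can convert the addresses to virtual ones and see how that affects performance. That should be interesting.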

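Warm-up

Now a little warm-up. Let's make another batch of small but useful byte commands: nip, emit, 1+, +!, -!, count, the return-stack words r>, >r, r@, the string literal ("), and the constant words 1, 2, 3, 4, 8. Don't forget to add them to the command table.

Here is the code for these commands: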

    b_nip = 0x39
    bcmd_nip:
        pop     rax
        mov     [rsp], rax
        jmp     _next

    b_emit = 0x81
    bcmd_emit:
        pop     rax
        mov     rsi, offset emit_buf    # buffer address
        mov     [rsi], al               # store the character in the buffer
        mov     rax, 1                  # system call no. 1: sys_write
        mov     rdi, 1                  # descriptor no. 1: stdout
        mov     rdx, 1                  # length
        push    r8
        syscall                         # call the kernel
        pop     r8
        jmp     _next

    b_wp = 0x18
    bcmd_wp:
        incq    [rsp]
        jmp     _next

    b_setp = 0x48
    bcmd_setp:
        pop     rcx
        pop     rax
        add     [rcx], rax
        jmp     _next

    b_setm = 0x49
    bcmd_setm:
        pop     rcx
        pop     rax
        sub     [rcx], rax
        jmp     _next

    b_2r = 0x60
    bcmd_2r:
        pop     rax
        sub     rbp, 8
        mov     [rbp], rax
        jmp     _next

    b_r2 = 0x61
    bcmd_r2:
        push    [rbp]
        add     rbp, 8
        jmp     _next

    b_rget = 0x62
    bcmd_rget:
        push    [rbp]
        jmp     _next

    b_str = 0x82
    bcmd_str:
        push    r8                      # address of the counted string
        movzx   rax, byte ptr [r8]      # length byte
        lea     r8, [r8 + rax + 1]      # skip the string in the bytecode
        jmp     _next

    b_count = 0x84
    bcmd_count:
        pop     rcx
        movzx   rax, byte ptr [rcx]
        inc     rcx
        push    rcx
        push    rax
        jmp     _next

    b_num1 = 0x03
    bcmd_num1:
        push    1
        jmp     _next

    b_num2 = 0x04
    bcmd_num2:
        push    2
        jmp     _next

    b_num3 = 0x05
    bcmd_num3:
        push    3
        jmp     _next

    b_num4 = 0x06
    bcmd_num4:
        push    4
        jmp     _next

    b_num8 = 0x07
    bcmd_num8:
        push    8
        jmp     _next

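The nip command removes the word just below the top of the stack. It is equivalent to swap drop. Sometimes this comes in handy.

The emit command prints one character taken from the stack. It uses the same system call number 1; the character is placed in a buffer of length 1.

The count command is very simple: it takes from the stack the address of a counted string and turns it into two values, the address of the string without the count byte and the length.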
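To make the counted-string convention concrete, here is a small Python model of count. It is only a sketch, not the author's assembly: the "memory" is a bytes object and addresses are plain indexes into it, both of which are assumptions of this illustration.

```python
def count(memory: bytes, addr: int) -> tuple[int, int]:
    """Model of the Forth word count: (addr) -> (addr + 1, length)."""
    length = memory[addr]          # the first byte is the counter
    return addr + 1, length        # the text starts right after the counter


# A counted string "hello" placed at offset 2 of a toy memory image.
memory = b"\x00\x00" + bytes([5]) + b"hello" + b"\xff"
text_addr, length = count(memory, 2)
print(memory[text_addr:text_addr + length])   # the characters without the counter
```

The pair returned here is the same (address, length) pair that type expects on the stack.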

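The commands b_r2, b_2r and b_rget are the Forth words r>, >r and r@. The first takes a word from the return stack and puts it on the arithmetic stack. The second does the opposite. The third copies a word from the return stack onto the arithmetic stack, leaving the return stack unchanged.

The commands b_setp and b_setm are the words +! and -!. They take a value and an address from the stack and modify the word at that address, adding the value to it or subtracting the value from it.

The b_str command has a parameter of arbitrary length: a counted string. The string sits in the bytecode right after the command byte, and the command simply pushes the address of that string onto the stack. In effect, this is a string literal.

I think the remaining commands need no comment.

We will also make a command that prints a constant string (."). We implement it as an entry point into type, like this: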

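    b_strp = 0x83
    bcmd_strp:
        movzx   rax, byte ptr [r8]      # length byte of the string
        inc     r8
        push    r8                      # address of the text (type expects: address, length)
        push    rax                     # length
        add     r8, rax                 # skip the string in the bytecode

    b_type = 0x80
    bcmd_type:
        mov     rax, 1                  # system call no. 1: sys_write
        mov     rdi, 1                  # descriptor no. 1: stdout
        pop     rdx                     # length
        pop     rsi                     # buffer address
        push    r8
        syscall                         # call the kernel
        pop     r8
        jmp     _next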

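This command is structured like b_str, except that it leaves nothing on the stack. The string that follows the command as a parameter is simply shown to the user.

The warm-up is over; time for something more serious. Let's deal with defining words and the other var commands.

Defining words

Recall variables. We know how they are laid out at the bytecode level (the var0 command). To create a new variable, Forth uses the following construct:

    variable <variable name>

After this sequence is executed, a new word <variable name> is created. Executing this new word pushes onto the stack the address where the variable's value is stored. Forth also has constants, which are created like this:

    <value> constant <constant name>

After the constant is created, executing the word <constant name> puts <value> on the stack.

So both variable and constant are defining words: they exist to create new words. In Forth, such words are described with the create ... does> construct.

variable and constant can be defined as follows: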

    : variable create 0 , does> ;
    : constant create , does> @ ;

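What does all this mean?

The word create, when executed, creates a new word whose name it takes from the input stream at execution time. After the word is created, the sequence of words up to does> is executed. But when the created word itself is executed, what runs is whatever is written after does>. At that moment the address of the data (the "data field", as Forth people say) is already on the stack.

So when a variable is created, the sequence "0 ," is executed: it reserves a machine word filled with zero. When the created word is executed, nothing is done (there is nothing after does>). The address of the memory holding the value simply stays on the stack.

In the definition of constant, a word is reserved and filled with the value taken from the stack. When the created word is executed, "@" runs and fetches the value at the given address.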
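The split between "what runs at creation time" and "what runs when the created word is executed" can be modelled in a few lines of Python. This is a sketch under loud assumptions: memory is a list of cells, the dictionary is a dict of callables, and the helper names (create, variable, constant, fetch) are invented for the illustration; none of this is the author's implementation.

```python
memory = []        # toy data space: a list of cells
words = {}         # toy dictionary: name -> callable
stack = []         # arithmetic stack


def create(name, init_cells, does):
    """Model of `create ... does>`: reserve cells now, bind run-time code."""
    data_addr = len(memory)
    memory.extend(init_cells)            # the part between create and does>

    def run():
        stack.append(data_addr)          # the data-field address is pushed first
        does()                           # then the code after does> runs
    words[name] = run


def variable(name):
    create(name, [0], lambda: None)      # : variable create 0 , does> ;


def fetch():
    stack.append(memory[stack.pop()])    # the word @


def constant(name, value):
    create(name, [value], fetch)         # : constant create , does> @ ;


variable("x")
constant("answer", 42)
words["x"]()         # leaves the address of x's cell
words["answer"]()    # leaves the stored value
print(stack)
```

Running x leaves an address, running answer leaves a value, which is exactly the difference between an empty does> part and a does> part containing @.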

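Now let's think about how a word created this way could be laid out. It pushes the data address onto the stack (like var0) and then transfers control to a particular bytecode address. The var0 command returns immediately. Here, instead of a return, we actually need a jump.

Let me restate what has to be done:

push the data address onto the stack
jump to the piece of code after does>

So we only need to transfer control to another bytecode address, having first pushed the address of the next byte (R8) onto the stack.

That is almost a branch command! And there is more than one of those: we already have branch8 and branch16. We'll call the new commands var8 and var16 and make them entry points into the branch commands. We save a jump on the jump command :) So it will look like this: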

    b_var8 = 0x29
    bcmd_var8:
        push    r8
    b_branch8 = 0x10
    bcmd_branch8:
        movsx   rax, byte ptr [r8]
        add     r8, rax
        jmp     _next

    b_var16 = 0x2A
    bcmd_var16:
        push    r8
    b_branch16 = 0x11
    bcmd_branch16:
        movsx   rax, word ptr [r8]
        add     r8, rax
        jmp     _next

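In principle, var32 and even var64 commands would not hurt either. We have no such long jumps, because ordinary jumps are never that long. For var commands, however, it is a perfectly realistic case. For now we will not implement these commands; we'll do it later if necessary.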
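The fall-through from var8 into branch8 can be traced with a small Python model. It is a sketch only: ip stands for R8, the bytecode is a bytes object, and the function names run_var8 and run_branch8 are made up for this illustration. The opcode value 0x29 is the one used in the article; the trailing 0x17 is b_exit.

```python
def run_branch8(code: bytes, ip: int) -> int:
    """Model of branch8: ip points at the signed 8-bit offset byte."""
    offset = int.from_bytes(code[ip:ip + 1], "little", signed=True)  # movsx
    return ip + offset                                               # add r8, rax


def run_var8(code: bytes, ip: int, stack: list) -> int:
    """Model of var8: push the current ip, then behave exactly like branch8."""
    stack.append(ip)                     # push r8
    return run_branch8(code, ip)         # fall through into branch8


# Layout: [var8 opcode][offset][8 data bytes][code after does> ...]
VAR8 = 0x29
code = bytes([VAR8, 9]) + bytes(8) + b"\x17"
stack = []
new_ip = run_var8(code, 1, stack)
print(stack, new_ip)
```

Note that the offset is counted from the offset byte itself, so in this layout a value of 9 lands on the first byte after the 8 data bytes.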

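With the defining words sorted out, it is time to decide on the dictionary.

Dictionary

Usually, when the Forth dictionary is described in simple terms, it is presented as a singly linked list of dictionary entries. In reality things are a little more complicated, because Forth supports multiple vocabularies. In fact, they form a tree. A search in such a tree starts from a "leaf": the last word of the current vocabulary. The current vocabulary is defined by the context variable, and the address of the last word is stored in the vocabulary word. One more variable, current, is used to manage vocabularies: it defines the vocabulary to which new words will be added. So one vocabulary can be selected for searching and another for adding new words.

For our simple case we could have done without multiple vocabularies, but I decided not to simplify anything. In fact, to understand the bytecode and the byte machine you do not need to know what is described in this section. So anyone who is not interested can simply skip it. And for those who want the details: onward!

Initially there is a base vocabulary named forth. This means that there is such a word: forth. This word is also called a "vocabulary", which causes some confusion. So when I mean the word, I will call it the vocabulary word.

New vocabularies are created with the following construct:

    vocabulary <name of the created vocabulary>

This creates a word named <name of the created vocabulary>. When executed, this word makes the created vocabulary the starting vocabulary for searches.

In fact, the vocabulary word holds a link to the last entry of that vocabulary, and the search starts from it. When executed, the vocabulary word writes a link to its own data field into the context variable.

Later we will be able to make the word vocabulary itself, which in Forth, with the current implementation, is described quite simply:

    : vocabulary create context @ , does> context ! ;

So, let's create the word forth. We'll use the var8 command. The bytecode "context !" is placed right after the data field: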

    forth:
        .byte b_var8
        .byte does_voc - .
        .quad 0                 # <-- link to the last entry of this vocabulary; the search starts from it
    does_voc:
        .byte b_call8
        .byte context - . - 1
        .byte b_set
        .byte b_exit

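Now back to building the dictionary itself.

In Forth, the description of a word in memory is usually called a "dictionary entry". In ordinary terms I would say there is an entry header and its code. But Forth does not quite use ordinary terms; there these are called the "name field", "link field", "code field" and "data field". I'll try to explain what all of this traditionally means in Forth.

The name field is the name of the word as a counted string. Just like in old Pascal: a byte with the string length, then the string itself. The link field is a reference to the previous entry. Traditionally it was simply an address, but we want platform-independent code, so it will be an offset. The code field, traditionally in Forth, is machine code (when the implementation is direct-threaded); for words outside the kernel it is a call to _call. We will simply have bytecode there. The data field is for words that contain data, such as variables or constants. By the way, the vocabulary word belongs to this kind too.

For the compiler we will also need flags. Usually Forth needs only one flag, immediate, and it is placed in the length byte (sometimes there is one more, hidden). But that is for direct-threaded code, where calling the code field transfers processor control. We have different kinds of words, bytecode and machine code, so at least two or even three flags are needed.

How much space does the link field need? At first I wanted to use 16 bits. After all, it is a link to the previous word, and a word is surely smaller than 64 KB. But then I remembered that a word can contain data of almost any size. Besides, with multiple vocabularies a link may skip over many words. It turns out that 8 bits is enough in most cases, but 16 or 32 may be needed, or even 64 bits if there is more than 4 GB of data. Well, let's support all the options, and store which option is used in the flags. That gives at least 4 flag bits: the immediate attribute, the kernel-word attribute, and 2 bits for the link-field variant. There is no way around it: the flags need a separate byte.

We define the flags as follows:

    f_code = 0x80
    f_immediate = 0x60

The f_code flag marks kernel words written in assembler; the f_immediate flag will be useful for the compiler, more on that in the next article. The two least significant bits determine the length of the link field (1, 2, 4 or 8 bytes).

So the entry header will look like this:

flags (1 byte)
link field (1-8 bytes)
name length byte
name (1-255 bytes)

Until now I have not used the assembler's macro facilities. Now we need them. Here is the macro, named item, that I came up with for building a word header: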

    .macro item name, flags = 0
        link = p_item - .
    9:
        .if link >= -256/2 && link < 256/2
            .byte \flags
            .byte link
        .elseif link >= -256*256/2 && link < 256*256/2
            .byte \flags | 1
            .word link
        .elseif link >= -256*256*256*256/2 && link < 256*256*256*256/2
            .byte \flags | 2
            .int link
        .elseif link >= -256*256*256*256*256*256*256*256/2 && link < 256*256*256*256*256*256*256*256/2
            .byte \flags | 3
            .quad link
        .endif
        p_item = 9b
        .byte 9f - . - 1
        .ascii "\name"
    9:
    .endm

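This macro uses the value p_item, the address of the previous dictionary entry. At the end this value is updated for future use: p_item = 9b. Here 9b is a label, not a number, don't confuse them :) The macro has two parameters: the name of the word and the flags (optional). At the start of the macro the offset to the previous word is computed. Then, depending on the size of the offset, the flags and a link field of the required size are assembled. After that come the name-length byte and the name itself.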
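The header layout and the choice of link width can be mirrored in Python with the struct module. This is only a sketch of the same byte layout: the function name make_header is hypothetical, and little-endian packing is assumed because the article targets x86-64.

```python
import struct

F_CODE = 0x80                                  # kernel-word flag from the article
LINK_FORMATS = ("<b", "<h", "<i", "<q")        # 1, 2, 4 or 8 byte signed link


def make_header(name: str, link: int, flags: int = 0) -> bytes:
    """Build flags + link field + counted name, picking the narrowest link."""
    for width_bits, fmt in enumerate(LINK_FORMATS):
        bits = 8 * struct.calcsize(fmt)
        if -(1 << (bits - 1)) <= link < (1 << (bits - 1)):
            break                              # this width can hold the offset
    encoded = name.encode("ascii")
    return (bytes([flags | width_bits])        # low two bits select the width
            + struct.pack(fmt, link)
            + bytes([len(encoded)])
            + encoded)


print(make_header("dup", -7, F_CODE).hex())
print(make_header("x", -300).hex())
```

A link of -7 fits in one byte, so the low flag bits stay 0; a link of -300 needs the 16-bit variant, so the flags byte gets its lowest bit set.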

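Before the first word, p_item is defined as follows:

    p_item = .

The dot is the current assembly address in the assembler. As a result of this definition, the first word refers to itself (its link field is 0). This marks the end of the dictionary.

By the way, what will be in the code field of kernel words? At the very least, the command code has to be stored somewhere. I decided to take the simplest route: kernel words will have bytecode too. For most commands it is just the byte command followed by b_exit. This way the interpreter does not need to examine the f_code flag, and these commands need no special treatment. You simply call the bytecode for everything.

This option has one more advantage. For commands with parameters you can specify safe parameters. For example, if you invoke the lit command in a direct-threaded Forth implementation, the system will crash. Here the entry will contain, say, lit 0, and this sequence simply puts 0 on the stack. Even branch can be made safe!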

        .byte b_branch8
        .byte 0f - .
    0:
        .byte b_exit

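Such a call has some overhead, but for the interpreter it is insignificant. The compiler, in turn, will look at the flags and compile correct, fast code.

The first word, of course, will be forth: the base vocabulary we are creating. This is where var with a link to the code after does> comes in handy. I already showed this code in the previous section, but I'll repeat it here with the header: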

    p_item = .
        item forth
        .byte b_var8
        .byte does_voc - .
        .quad 0
    does_voc:
        .byte b_call8
        .byte context - . - 1
        .byte b_set
        .byte b_exit

Right away we create the variables current and context; we will need them to search for words:

        item current
        .byte b_var0
        .quad 0

        item context
    context:
        .byte b_var0
        .quad 0

Now we have to be patient and write a header, with the f_code flag, for every word we have written in assembler:

        item 0, f_code
        .byte b_num0
        .byte b_exit

        item 1, f_code
        .byte b_num1
        .byte b_exit

        ...

        item 1-, f_code
        .byte b_wm
        .byte b_exit

        item 1+, f_code
        .byte b_wp
        .byte b_exit

        item +, f_code
        .byte b_add
        .byte b_exit

        item -, f_code
        .byte b_sub
        .byte b_exit

        item *, f_code
        .byte b_mul
        .byte b_exit

And so on...

For commands written in bytecode it is even easier. It is enough to add a header in front of the bytecode, for example:

        item hold
    hold:
        .byte b_call8
        .byte holdpoint - . - 1     # holdpoint
        ...

For commands with parameters we will set safe parameters. For example, let the lit commands return the number Pi if someone calls them interactively. A little Easter egg :)

        item lit8, f_code
        .byte b_lit8
        .byte 31
        .byte b_exit

        item lit16, f_code
        .byte b_lit16
        .word 31415
        .byte b_exit

        item lit32, f_code
        .byte b_lit32
        .int 31415926
        .byte b_exit

        item lit64, f_code
        .byte b_lit64
        .quad 31415926535
        .byte b_exit

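The last word in the list will, symbolically, be the word bye. But we still need to initialise the data field of the word forth with the address of this last word. To get the address of this word, we use the var0 command:

    last_item:
        .byte b_var0
        item bye, f_code
        .byte b_bye

With this layout, calling the address last_item in bytecode gives us the address of the word bye. To write it into the data field of the word forth, we execute forth, after which the required address is in context. So the system initialisation code looks like this:

    forth last_item context @ !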

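Now let's move on to the interpreter itself. First we need to work with the input buffer and extract words from it. Let me remind you that the interpreter in Forth is very simple. It extracts words from the input buffer one after another and tries to find them. If a word is found, the interpreter runs it.

Input buffer and word extraction

To be honest, I do not want to spend much time studying the Forth standards. Still, I will try to stay as close to them as I can, mostly from memory. If Forth experts see a serious discrepancy here, write to me and I'll fix it.

Forth has three variables for working with the buffer: tib, #tib and >in. The tib variable gives the address of the input buffer. The #tib variable gives the number of characters in the buffer. The >in variable contains the offset in the input buffer beyond which lies the text that has not been processed yet. Let's define these variables.

        item tib
        .byte b_var0
    v_tib:
        .quad 0

        item #tib
        .byte b_var0
    v_ntib:
        .quad 0

        item >in
        .byte b_var0
    v_in:
        .quad 0

Next we make the word blword. Using these variables, it takes the next word from the input stream. The delimiter is the space, together with every character whose code is less than that of the space. This word will be written in assembler. After debugging, it came out like this: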

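    b_blword = 0xF0
    bcmd_blword:
        mov     rsi, v_tib          # address of the text buffer
        mov     rdx, rsi            # keep the buffer start in RDX for later
        mov     rax, v_in           # offset in the buffer
        mov     rcx, v_ntib         # size of the buffer
        add     rsi, rax            # RSI: pointer to the unprocessed text
        sub     rcx, rax            # number of characters left
        jz      3f
    word2:
        lodsb                       # load AL from [RSI] and advance RSI
        cmp     al, ' '
        ja      1f                  # not a delimiter: a word starts here
        dec     rcx
        jnz     word2               # keep skipping delimiters
    3:
        sub     rsi, rdx
        mov     v_in, rsi
        push    rcx
        jmp     _next
    1:
        lea     rdi, [rsi - 1]      # RDI = RSI - 1 (start of the word)
        dec     rcx
    word3:
        lodsb
        cmp     al, ' '
        jbe     2f
        dec     rcx
        jnz     word3
    2:
        mov     rax, rsi
        sub     rsi, rdx            # offset of the first unprocessed character
        mov     v_in, rsi
        sub     rax, rdi
        dec     rax
        jz      word1
        push    rdi                 # address of the word
    word1:
        push    rax                 # length of the word
        jmp     _next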

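This word is similar to the standard word, but unlike it, it treats all the delimiters alike and does not copy the word to a buffer. It returns just two values on the stack: the address and the length. If no word could be extracted, it returns 0. Now it is time to start writing the interpreter.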
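The intended behaviour of blword is easy to state in Python. The sketch below models the contract described above (skip delimiters, return address and length, length 0 when nothing is left); it is an idealised model, not a transcription of the assembly, and the extra new_pos return value stands in for the >in variable.

```python
def blword(buf: bytes, pos: int) -> tuple[int, int, int]:
    """Return (address, length, new_pos); length 0 means no word is left."""
    n = len(buf)
    while pos < n and buf[pos] <= 0x20:      # skip spaces and control characters
        pos += 1
    start = pos
    while pos < n and buf[pos] > 0x20:       # collect the word itself
        pos += 1
    return start, pos - start, pos


buf = b"  word1\tword2 word3"
pos = 0
found = []
while True:
    addr, length, pos = blword(buf, pos)
    if length == 0:
        break
    found.append(buf[addr:addr + length])    # no copy is needed in the real word
print(found)
```

The loop at the bottom has the same shape as the begin ... while ... repeat loop that the interpreter uses.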

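Word search and the interpreter

First, let's make the word interpret. This word takes a new word from the buffer using blword, looks it up in the dictionary and executes it. This repeats until the buffer is exhausted. We cannot search for words yet, so we will write a test stub that simply prints each word from the buffer using type. This gives us a chance to check and debug blword:

    # : interpret begin blword dup while type repeat drop ;
        item interpret
    1:
        .byte b_blword
        .byte b_dup
        .byte b_qnbranch8
        .byte 0f - .
        .byte b_type
        .byte b_branch8
        .byte 1b - .
    0:
        .byte b_drop
        .byte b_exit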

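Now let's make the word quit. This is how Forth systems are usually implemented: the word quit or abort is used to enter interpretation mode. The word quit resets the stacks and starts an endless loop of buffer input and interpretation. For now, in our case, it is simply a call to interpret. The code of this word consists of two parts. The first part is in assembler, the second in bytecode. The first part:

    b_quit = 0xF1
    bcmd_quit:
        lea     r8, quit
        mov     rsp, init_stack
        mov     rbp, init_rstack
        jmp     _next

The second part:

    quit:
        .byte b_call16
        .word interpret - . - 2
        .byte b_bye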

As usual, the assembler code goes in the .text section and the bytecode in the .data section.

Finally, we change the start bytecode. It now only initialises the dictionary, points the buffer at the start string, and calls quit.

    # forth last_item context @ ! start_code tib ! <string length> #tib ! quit
    start:
        .byte b_call16
        .word forth - . - 2
        .byte b_call16
        .word last_item - . - 2
        .byte b_call16
        .word context - . - 2
        .byte b_get
        .byte b_set
        .byte b_call8
        .byte start_code - . - 1
        .byte b_call16
        .word tib - . - 2
        .byte b_set
        .byte b_lit16
        .word 1f - 0f
        .byte b_call16
        .word ntib - . - 2
        .byte b_set
        .byte b_quit
    start_code:
        .byte b_var0
    0:
        .ascii "word1 word2 word3"
    1:

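Compile, link, run!

    $ as forth.s -o forth.o -g -ahlsm >list.txt
    $ ld forth.o -o forth
    $ ./forth
    word1word2wordBye!

It looks a bit like mush, but this is exactly the expected result: we print with no delimiters. By the way, it would not hurt to output a newline before "Bye!" in the future.

Of course, I had to tinker with debugging. Besides the already familiar "Segmentation fault (core dumped)", I sometimes got amusing results. For example, this: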

 $ ./forth word1word2word3forth)%60Acurrent(context(%600lit8lit16zlit32v%5E%DF%80lit64v%5E%DF%80call8call16call32branch8branch16qbranch8qbranch16exit1-+!-%22*#/$mod%25/mod&abs'dup0drop1swap2rot3-rot4over5pick6roll7depth8@@!Ac@Bc!Cw@Dw!Ei@Fi!G0=P0%3CQ0%3ER=S%3CT%3EU%3C=V%3E=Wvar8)var160base(holdbuf(Qholdpoint(hold@0U110ACp@&20T0!?!%3CgF!A0@RF!5%220'%DE%A61Q-%DD%80:tib(%7F%60(%3Ein(%20%20%20%20%20%20%20interpret01('byeSegmentation%20fault%20(core%20dumped)

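This looks like our entire binary dictionary, with the text chopped up at the delimiters :) It happened when I forgot the "dec rcx" before word3 in the b_blword command.

We can extract words from the input stream, and we have a dictionary. Now we need to implement the dictionary search and the launching of a word. For that we need the words find, cfa and execute.

The word find takes from the stack the address of a word and its length. It returns the address of the dictionary entry, or 0 if nothing is found.

The word cfa takes the address of an entry and computes the address of the executable bytecode.

The word execute then runs the bytecode.

Let's start with find. In the Forth standards it takes a single address: a counted string. But I do not want to copy the string into a buffer yet again, so I will deviate from the standard a little. The word find will take two parameters on the stack: the address and the length of the string (that is, exactly what blword returns). After debugging, the word took the following form: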

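    b_find = 0xF2
    bcmd_find:
        pop     rbx                         # length of the name
        pop     r9                          # address of the name
        mov     rdx, v_context
        mov     rdx, [rdx]                  # address of the last dictionary entry; the search starts here
    find0:
        mov     al, [rdx]                   # flags
        and     al, 3                       # the two low bits select the link-field length
        or      al, al
        jz      find_l8
        cmp     al, 1
        jz      find_l16
        cmp     al, 2
        jz      find_l32
        mov     r10, [rdx + 1]              # 64-bit link
        lea     rsi, [rdx + 9]              # address of the name field
        jmp     find1
    find_l32:
        movsx   r10, dword ptr [rdx + 1]    # 32-bit link
        lea     rsi, [rdx + 5]              # address of the name field
        jmp     find1
    find_l16:
        movsx   r10, word ptr [rdx + 1]     # 16-bit link
        lea     rsi, [rdx + 3]              # address of the name field
        jmp     find1
    find_l8:
        movsx   r10, byte ptr [rdx + 1]     # 8-bit link
        lea     rsi, [rdx + 2]              # address of the name field
    find1:
        movzx   rax, byte ptr [rsi]         # name length stored in the entry
        cmp     rax, rbx
        jz      find2                       # lengths match: compare the names
    find3:
        or      r10, r10
        jz      find_notfound               # end of the dictionary, nothing found
        add     rdx, r10                    # move to the previous entry
        jmp     find0                       # and check it the same way
    find2:
        inc     rsi
        mov     rdi, r9
        mov     rcx, rax
        repz    cmpsb
        jnz     find3                       # names differ: keep searching
        push    rdx                         # found: push the entry address
        jmp     _next
    find_notfound:
        push    r10                         # r10 is 0 here
        jmp     _next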
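The same walk over the entry chain can be written as a short Python model. It is a sketch with several assumptions: the dictionary is a bytearray, addresses are indexes into it, the helper add_entry (invented for this example) always uses 8-bit links, and "not found" is reported as -1 because index 0 is a valid entry address in this toy image, whereas the assembly returns 0.

```python
import struct

LINK_FORMATS = ("<b", "<h", "<i", "<q")      # selected by the two low flag bits


def add_entry(image: bytearray, prev: int, name: bytes) -> int:
    """Append a header with an 8-bit link; return the entry address."""
    here = len(image)
    link = prev - here if prev >= 0 else 0   # a zero link ends the chain
    image += bytes([0]) + struct.pack("<b", link) + bytes([len(name)]) + name
    return here


def find(image: bytes, last: int, name: bytes) -> int:
    """Walk the chain backwards from `last`; return the entry address or -1."""
    entry = last
    while True:
        fmt = LINK_FORMATS[image[entry] & 3]
        (link,) = struct.unpack_from(fmt, image, entry + 1)
        name_at = entry + 1 + struct.calcsize(fmt)
        length = image[name_at]
        if image[name_at + 1:name_at + 1 + length] == name:
            return entry
        if link == 0:
            return -1
        entry += link                        # links are negative offsets


image = bytearray()
last = -1
for word in (b"forth", b"dup", b"drop"):
    last = add_entry(image, last, word)
print(find(image, last, b"dup"), find(image, last, b"nope"))
```

Because the links are relative, the same image could be loaded at any address, which is the point of using offsets instead of pointers.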

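This is perhaps the most complex word today. Now we modify the word interpret, replacing type with "find .":

    # : interpret begin blword dup while find . repeat drop ;
        item interpret
    interpret:
        .byte b_blword
        .byte b_dup
        .byte b_qnbranch8
        .byte 0f - .
        .byte b_find
        .byte b_call16
        .word dot - . - 2
        .byte b_branch8
        .byte interpret - .
    0:
        .byte b_drop
        .byte b_exit

The test string now has to contain words that exist in the dictionary, for example "0 1- dup +".

Everything is ready to run!

    $ ld forth.o -o forth
    $ ./forth
    6297733 6297898 6298375 Bye!

Great, the search works. These are the addresses of the words (in decimal). Now the word cfa. Let it be in assembler too; it is very simple, and it handles the flags the same way find does: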

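    b_cfa = 0xF3
    bcmd_cfa:
        pop     rdx                     # address of the dictionary entry
        mov     al, [rdx]               # flags
        and     al, 3                   # the two low bits select the link-field length
        or      al, al
        jz      cfa_l8
        cmp     al, 1
        jz      cfa_l16
        cmp     al, 2
        jz      cfa_l32
        lea     rsi, [rdx + 9]          # name field (64-bit link)
        jmp     cfa1
    cfa_l32:
        lea     rsi, [rdx + 5]          # name field (32-bit link)
        jmp     cfa1
    cfa_l16:
        lea     rsi, [rdx + 3]          # name field (16-bit link)
        jmp     cfa1
    cfa_l8:
        lea     rsi, [rdx + 2]          # name field (8-bit link)
    cfa1:
        xor     rax, rax
        lodsb                           # AL = name length, RSI now points at the name
        add     rsi, rax                # skip the name: RSI = address of the bytecode
        push    rsi
        jmp     _next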
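The address arithmetic of cfa fits in two lines of Python. This is a sketch over a bytes object with index-based addresses (an assumption of the model); the example entry is built by hand with a 16-bit link so that the width selection is visible.

```python
import struct

LINK_SIZES = (1, 2, 4, 8)                    # indexed by the two low flag bits


def cfa(image: bytes, entry: int) -> int:
    """Skip flags, link field and counted name; return the bytecode address."""
    name_at = entry + 1 + LINK_SIZES[image[entry] & 3]
    return name_at + 1 + image[name_at]


# flags = 0x81 (f_code, 16-bit link), link = -300, name "dup",
# followed by the bytecode b_dup (0x30) and b_exit (0x17).
entry = (bytes([0x81]) + struct.pack("<h", -300)
         + bytes([3]) + b"dup" + bytes([0x30, 0x17]))
print(cfa(entry, 0), hex(entry[cfa(entry, 0)]))
```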

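And finally, the word execute is simpler still:

    b_execute = 0xF4
    bcmd_execute:
        sub     rbp, 8
        mov     [rbp], r8           # save the return address on the return stack
        pop     r8                  # address of the bytecode to run
        jmp     _next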
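To see how execute cooperates with exit, here is a tiny Python model of the byte machine with just five commands. It is an illustrative sketch: the opcode values for lit8, execute, exit and the constant 1 are the ones used in the article, while BYE = 0x01 is an assumption taken from the position of bcmd_bye in the command table, and lit8 is treated as unsigned for brevity.

```python
LIT8, EXECUTE, EXIT, NUM1, BYE = 0x08, 0xF4, 0x17, 0x03, 0x01


def run(code: bytes, ip: int = 0) -> list:
    """Tiny model of the byte machine: data stack, return stack, ip (R8)."""
    stack, rstack = [], []
    while True:
        op = code[ip]
        ip += 1
        if op == LIT8:
            stack.append(code[ip])      # one-byte literal follows the opcode
            ip += 1
        elif op == NUM1:
            stack.append(1)
        elif op == EXECUTE:
            rstack.append(ip)           # save the return address
            ip = stack.pop()            # continue at the popped bytecode address
        elif op == EXIT:
            ip = rstack.pop()           # return to the saved address
        elif op == BYE:
            return stack
        else:
            raise ValueError(f"bad byte code {op:#x} at {ip - 1}")


# 0: lit8 5   2: execute   3: bye   4: (padding)   5: 1   6: exit
program = bytes([LIT8, 5, EXECUTE, BYE, 0x00, NUM1, EXIT])
print(run(program))
```

The word at address 5 behaves like a kernel-word entry from the dictionary: one byte command followed by exit.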

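We fix the word interpret and run it!

    # : interpret begin blword dup while find cfa execute repeat drop ;
        item interpret
    interpret:
        .byte b_blword
        .byte b_dup
        .byte b_qnbranch8
        .byte 0f - .
        .byte b_find
        .byte b_cfa
        .byte b_execute
        .byte b_branch8
        .byte interpret - .
    0:
        .byte b_drop
        .byte b_exit

Launch:

    $ as forth.s -o forth.o -g -ahlsm >list.txt
    $ ld forth.o -o forth
    $ ./forth
    -2 Bye!

Hooray, it works! (c) Matroskin the Cat

Indeed, if you subtract 1 from 0 and add the result to itself, you get -2 :)
This is great, but I still want to type commands from the keyboard. And there is one more problem: our interpreter understands only the numbers 0, 1, 2, 3, 4 and 8 (which are defined as constants). To teach it to understand any number, we need the word "number?". Just as with find, I will not use a buffer. The word "number?" takes two parameters on the stack: the address and the length of the string. On success it returns the resulting number and the flag 1. If the conversion fails, a single number is left on the stack: 0.

The code turned out long, but fairly simple and linear: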

    b_numberq = 0xF5
    bcmd_numberq:
        pop     rcx                 # length of the string
        pop     rsi                 # address of the string
        xor     rax, rax            # the number being built
        xor     rbx, rbx            # the current digit
        mov     r9, v_base          # number base
        xor     r10, r10            # sign flag
        or      rcx, rcx
        jz      num_false
        mov     bl, [rsi]
        cmp     bl, '+'
        jnz     1f
        inc     rsi
        dec     rcx
        jz      num_false
        jmp     num0
    1:
        cmp     bl, '-'
        jnz     num0
        mov     r10, 1
        inc     rsi
        dec     rcx
        jz      num_false
    num0:
        mov     bl, [rsi]
        cmp     bl, '0'
        jb      num_false
        cmp     bl, '9'
        jbe     num_09
        cmp     bl, 'A'
        jb      num_false
        cmp     bl, 'Z'
        jbe     num_AZ
        cmp     bl, 'a'
        jb      num_false
        sub     bl, 'a' - 10
        jmp     num_check
    num_AZ:
        sub     bl, 'A' - 10
        jmp     num_check
    num_09:
        sub     bl, '0'
    num_check:
        cmp     rbx, r9
        jge     num_false           # the digit does not exist in this base
        mul     r9                  # number = number * base (clobbers RDX)
        add     rax, rbx            # ... + digit
        inc     rsi
        dec     rcx
        jnz     num0
        or      r10, r10
        jz      num_done
        neg     rax                 # apply the minus sign
    num_done:
        push    rax
        push    1
        jmp     _next
    num_false:
        xor     rcx, rcx
        push    rcx
        jmp     _next
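The contract of "number?" (the number plus flag 1 on success, a lone 0 on failure) can be checked against a reference model in Python. This is a sketch: the function name number_q and the tuple return value, which stands in for what the word leaves on the stack, are assumptions of the illustration.

```python
def number_q(text: str, base: int = 10) -> tuple:
    """Return (value, 1) on success or (0,) on failure, like the word number?."""
    sign, digits = 1, text
    if text[:1] in ("+", "-"):               # optional sign
        sign = -1 if text[0] == "-" else 1
        digits = text[1:]
    if not digits:                           # empty string or a lone sign
        return (0,)
    value = 0
    for ch in digits:
        if "0" <= ch <= "9":
            digit = ord(ch) - ord("0")
        elif "A" <= ch <= "Z":
            digit = ord(ch) - ord("A") + 10
        elif "a" <= ch <= "z":
            digit = ord(ch) - ord("a") + 10
        else:
            return (0,)
        if digit >= base:                    # digit not valid in this base
            return (0,)
        value = value * base + digit         # multiply first, then add the digit
    return (sign * value, 1)


for sample in ("60", "-2", "+7", "12x", "-"):
    print(sample, number_q(sample))
print("ff", number_q("ff", 16))
```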

We modify interpret. If a word is not in the dictionary, we try to interpret it as a number:

    # : interpret
    #     begin
    #         blword dup
    #     while
    #         over over find dup
    #         if -rot drop drop cfa execute else number? drop then
    #     repeat
    #     drop ;
        item interpret
    interpret:
        .byte b_blword
        .byte b_dup
        .byte b_qnbranch8
        .byte 0f - .
        .byte b_over
        .byte b_over
        .byte b_find
        .byte b_dup
        .byte b_qnbranch8
        .byte 1f - .
        .byte b_mrot
        .byte b_drop
        .byte b_drop
        .byte b_cfa
        .byte b_execute
        .byte b_branch8
        .byte 2f - .
    1:
        .byte b_numberq
        .byte b_drop
    2:
        .byte b_branch8
        .byte interpret - .
    0:
        .byte b_drop
        .byte b_exit

    last_item:
        .byte b_var0
        item bye, f_code
        .byte b_bye

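And here I got stuck! Debugging bytecode like this in assembler, with no breakpoints in the bytecode and no way to simply "step" through it... On top of that, with stack juggling that is far from trivial and no easy way to look at the stack contents... And in GDB, where there is only a command line... I tell you, it is a brain explosion! No, worse than that. It is a BRAIN EXPLOSION!

But... we are Indians, we will always find a workaround :)

In the end I found the following solution: I implemented a command that prints the stack contents, ".s". The command turned out not to be the simplest, but it is still simpler than interpret. And, as it turned out, veeery useful. Here it is: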

    # : .s depth dup . c": emit do dup while dup pick . 1- again drop ;
        item .s                     # 11 22 33
    prstack:
        .byte b_depth               # 11 22 33 3
        .byte b_dup                 # 11 22 33 3 3
        .byte b_lit8
        .byte '('
        .byte b_emit
        .byte b_call16              # 11 22 33 3
        .word dot - . - 2
        .byte b_strp                # 11 22 33 3
        .byte 3
        .ascii "): "
    1:
        .byte b_dup                 # 11 22 33 3 3
        .byte b_qnbranch8           # 11 22 33 3
        .byte 2f - .
        .byte b_dup                 # 11 22 33 3 3
        .byte b_pick                # 11 22 33 3 11
        .byte b_call16              # 11 22 33 3
        .word dot - . - 2
        .byte b_wm                  # 11 22 33 2
        .byte b_branch8
        .byte 1b - .
    2:
        .byte b_drop                # 11 22 33
        .byte b_exit

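On the right is an example of the stack contents after each command is executed. Of course, there is a loop here, and this shows only the first pass. But the other passes are very similar; only the value at the top of the stack changes. After this kind of "tracing", the command worked right away!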
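For comparison, the text that this first version of ".s" produces can be modelled in Python. It is a sketch: a Python list stands in for the arithmetic stack, with the deepest element first, and the function name dot_s is invented for the illustration. The format follows the bytecode above: an opening parenthesis, the depth printed by "." (which adds a trailing space), the string "): ", then each item from the deepest to the top.

```python
def dot_s(stack: list) -> str:
    """Format the stack the way the first version of .s prints it."""
    depth = len(stack)
    items = "".join(f"{x} " for x in stack)   # "." prints a space after each number
    return f"({depth} ): {items}"


print(dot_s([11, 22, 33]))
print(dot_s([]))
```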

For debugging I created the following macro:

    .macro prs new_line = 1
        .byte b_call16
        .word prstack - . - 2
        .if \new_line > 0
            .byte b_lit8, '\n'
            .byte b_emit
        .endif
    .endm

It is used by inserting it in the right places, like this:

        item interpret
    interpret:
        .byte b_blword
        prs
        .byte b_dup
        prs
        .byte b_qnbranch8
        .byte 0f - .
        .byte b_over
        .byte b_over
        ......

As a result, the very first run produced this output:

    $ ./forth
    (2 ): 6297664 1
    (3 ): 6297664 1 1
    (3 ): 2 6297666 1
    (4 ): 2 6297666 1 1
    (4 ): 2 3 6297668 1
    (5 ): 2 3 6297668 1 1
    (3 ): 6 6297670 2
    (4 ): 6 6297670 2 2
    (4 ): 6 6297670 6297673 1
    (5 ): 6 6297670 6297673 1 1
    6297670 (2 ): 6 0
    (3 ): 6 0 0
    Bye!

Every movement on the stack is clearly visible. I should have done this earlier :)

I went further and made one more debugging macro:

    .macro pr string
        .byte b_strp
        .byte 9f - 8f
    8:
        .ascii "\n\string"
    9:
    .endm

As a result, it became possible to do this:

        item interpret
    interpret:
        .byte b_blword
        pr blworld
        prs
        .byte b_dup
        .byte b_qnbranch8
        .byte 0f - .
        .byte b_over
        .byte b_over
        prs
        .byte b_find
        pr find
        prs
        .byte b_dup
        .byte b_qnbranch8
        .byte 1f - .
        .byte b_mrot
        .byte b_drop
        .byte b_drop
        .byte b_cfa
        pr execute
        prs
        .byte b_execute
        .byte b_branch8
        .byte 2f - .
    1:
        .byte b_numberq
        pr numberq
        prs
        .byte b_drop
    2:
        .byte b_branch8
        .byte interpret - .
    0:
        .byte b_drop
        .byte b_exit

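And get this:

    $ ./forth
    blworld(2 ): 6297664 2
    (4 ): 6297664 2 6297664 2
    find(3 ): 6297664 2 0
    numberq(2 ): 6297664 0
    blworld(3 ): 6297664 6297667 2
    (5 ): 6297664 6297667 2 6297667 2
    find(4 ): 6297664 6297667 2 0
    numberq(3 ): 6297664 6297667 0
    blworld(4 ): 6297664 6297667 6297670 1
    (6 ): 6297664 6297667 6297670 1 6297670 1
    find(5 ): 6297664 6297667 6297670 1 6297958
    execute(3 ): 6297664 6297667 6297962
    blworld(3 ): 39660590749888 6297672 1
    (5 ): 39660590749888 6297672 1 6297672 1
    find(4 ): 39660590749888 6297672 1 6298496
    execute(2 ): 39660590749888 6298500
    39660590749888
    blworld(1 ): 0
    Bye!

This was an attempt to interpret the string "20 30 * .".

And one could also print source line numbers... well, maybe later...

Of course, this is classic logging for debugging purposes, but somehow I did not think of it right away.

Overall, as a result of debugging I found that the stack was going out of bounds. This is the opposite of overflow: it happens when more is taken from the stack than was put on it. I added a check for this to ".s".
With the new macros, debugging went quickly. By the way, before this I placed one bytecode per line. But the assembler allows several bytes on one line, so why not use that.

Let's finish the word interpret by adding two checks: that a word could not be converted to a number, and that the stack went out of bounds. As a result, interpret looks like this: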

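        item interpret
    interpret:
        .byte b_blword
        .byte b_dup
        .byte b_qnbranch8
        .byte 0f - .
        .byte b_over
        .byte b_over
        .byte b_find
        .byte b_dup
        .byte b_qnbranch8
        .byte 1f - .
        .byte b_mrot
        .byte b_drop
        .byte b_drop
        .byte b_cfa
        .byte b_execute
        .byte b_branch8
        .byte 2f - .
    1:
        .byte b_drop
        .byte b_over, b_over
        .byte b_numberq                     # try to convert the word to a number
        .byte b_qbranch8, 3f - .            # flag is not 0: converted successfully, go to label 3
        .byte b_type                        # otherwise print the word
        .byte b_strp                        # followed by the message
        .byte 19                            # length of the message below
        .ascii " : word not found!\n"
        .byte b_quit                        # and restart interpretation
    3:
        .byte b_nip, b_nip                  # remove the values kept for the error message (b_over, b_over)
    2:                                      # check that the stack is within bounds
        .byte b_depth                       # stack depth
        .byte b_zlt                         # is it less than 0 (0<)?
        .byte b_qnbranch8, interpret_ok - . # no: everything is fine, continue
        .byte b_strp                        # yes: print the message
        .byte 14
        .ascii "\nstack fault!\n"
        .byte b_quit                        # and restart interpretation
    interpret_ok:
        .byte b_branch8
        .byte interpret - .
    0:
        .byte b_drop
        .byte b_exit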
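Stripped of the stack shuffling, the control flow of interpret is: take a word, look it up, execute it if found, otherwise try to convert it to a number, otherwise report an error. A Python sketch of that flow is shown below. The dictionary of lambdas, the use of int() in place of "number?", and raising LookupError instead of printing a message and calling quit are all simplifications made for this illustration.

```python
def interpret(line: str, words: dict, stack: list) -> None:
    """Look each token up in `words`; otherwise try to read it as a number."""
    for token in line.split():               # split() plays the role of blword
        if token in words:                   # find cfa execute
            words[token](stack)
            continue
        try:
            stack.append(int(token))         # number?
        except ValueError:
            raise LookupError(f"{token} : word not found!") from None


words = {
    "*": lambda s: s.append(s.pop() * s.pop()),
    "+": lambda s: s.append(s.pop() + s.pop()),
    "dup": lambda s: s.append(s[-1]),
    "1-": lambda s: s.append(s.pop() - 1),
}
stack = []
interpret("60 60 24 * *", words, stack)
print(stack)
```

As in the bytecode version, the dictionary is consulted first, so a word named like a number would take priority over the numeric conversion.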

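By the way, note that for now the quit command resets the stacks and starts interpretation again without changing the state of the buffer. So interpretation continues, but with "fresh" stacks. We will fix this a little later.

All that remains is to organise keyboard input.

Keyboard input

Keyboard input in Forth is simple. There is the word expect, which takes two parameters: the address of a buffer and its size. This word performs the keyboard input. The actual number of characters entered is placed in the variable span. Let's make these words. We will read from standard input.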

    .data
        item span
    span:
        .byte b_var0
    v_span:
        .quad 0

    .text
    b_expect = 0x88
    bcmd_expect:
        mov     rax, 0          # system call no. 0: sys_read
        mov     rdi, 0          # descriptor no. 0: stdin
        pop     rdx             # buffer length
        pop     rsi             # buffer address
        push    r8
        syscall                 # call the kernel
        pop     r8
        mov     rbx, rax
        or      rax, rax
        jge     1f
        xor     rbx, rbx        # on error, report 0 characters read
    1:
        mov     v_span, rbx
        jmp     _next

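Now we need to create a keyboard input buffer. Let it be 256 characters long.
We put it in place of the previous test string.

    inbuf_size = 256
    inbuf:
        .byte b_var0
        .space inbuf_size

Then we modify quit and the start bytecode. We set the tib variable to the inbuf input buffer, call expect, and then copy the value from span to #tib. We zero the >in variable and call interpret. And so on in a loop. A few finishing touches remain: add an input prompt, and it would be nice to show the state of the stack (and we already have a ready-made command for that!). After several iterations we arrived at this code (the start and quit commands):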

    # forth last_item context @ ! quit
    start:
        .byte b_call16
        .word forth - . - 2
        .byte b_call16
        .word last_item - . - 2
        .byte b_call16
        .word context - . - 2
        .byte b_get
        .byte b_set
        .byte b_quit

    inbuf:
        .byte b_var0
        .space inbuf_size

    # begin inbuf dup tib ! inbuf_size expect span @ #tib ! 0 >in ! interpret again
    quit:
        .byte b_strp, 1
        .ascii "\n"
        .byte b_call16
        .word prstack - . - 2
        .byte b_strp
        .byte 2
        .ascii "> "
        .byte b_call16
        .word inbuf - . - 2
        .byte b_dup
        .byte b_call16
        .word tib - . - 2
        .byte b_set
        .byte b_lit16
        .word inbuf_size
        .byte b_expect
        .byte b_call16
        .word span - . - 2
        .byte b_get
        .byte b_call16
        .word ntib - . - 2
        .byte b_set
        .byte b_num0
        .byte b_call16
        .word bin - . - 2
        .byte b_set
        .byte b_call16
        .word interpret - . - 2
        .byte b_branch8, quit - .

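Here is the result:

    $ ./forth
    ( 0 ): > 60
    ( 1 ): 60 > 60 24
    ( 3 ): 60 60 24 > rot
    ( 3 ): 60 24 60 > -rot
    ( 3 ): 60 60 24 > swap
    ( 3 ): 60 24 60 > * * .
    86400
    ( 0 ): > 200 30 /mod
    ( 2 ): 20 6 > bye
    Bye!
    $

Everything after the ">" symbol is my keyboard input. The rest is the system's response. I played with the commands a little, typing from the keyboard: I performed several stack operations and calculated the number of seconds in a day.

Summary

The interpreter is complete and working. And it says goodbye politely: you tell it "bye" and it answers "Bye!" :)
The prompt shows the contents of the arithmetic stack. The first number, in parentheses, is the stack size, then come the contents, and then the ">" input prompt. You can enter any of the implemented commands (I counted 76). True, many of them make sense only for the compiler: for example, the literal, branch and call commands.

Full source (about 1300 lines)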

 .intel_syntax noprefix stack_size = 1024 f_code = 0x80 f_immediate = 0x60 .macro item name, flags = 0 link = p_item - . 9: .if link >= -256/2 && link < 256/2 .byte \flags .byte link .elseif link >= -256*256/2 && link < 256*256/2 .byte \flags | 1 .word link .elseif link >= -256*256*256*256/2 && link < 256*256*256*256/2 .byte \flags | 2 .int link .elseif link >= -256*256*256*256*256*256*256*256/2 && link < 256*256*256*256*256*256*256*256/2 .byte \flags | 3 .quad link .endif p_item = 9b .byte 9f - . - 1 .ascii "\name" 9: .endm .section .data init_stack: .quad 0 init_rstack: .quad 0 emit_buf: .byte 0 inbuf_size = 256 msg_bad_byte: .ascii "Bad byte code!\n" msg_bad_byte_len = . - msg_bad_byte #  len    msg_bye: .ascii "\nBye!\n" msg_bye_len = . - msg_bye bcmd: .quad bcmd_bad, bcmd_bye, bcmd_num0, bcmd_num1, bcmd_num2, bcmd_num3, bcmd_num4, bcmd_num8 # 0x00 .quad bcmd_lit8, bcmd_lit16, bcmd_lit32, bcmd_lit64, bcmd_call8, bcmd_call16, bcmd_call32, bcmd_bad .quad bcmd_branch8, bcmd_branch16, bcmd_qbranch8, bcmd_qbranch16, bcmd_qnbranch8, bcmd_qnbranch16,bcmd_bad, bcmd_exit # 0x10 .quad bcmd_wp, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_wm, bcmd_add, bcmd_sub, bcmd_mul, bcmd_div, bcmd_mod, bcmd_divmod, bcmd_abs # 0x20 .quad bcmd_var0, bcmd_var8, bcmd_var16, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_dup, bcmd_drop, bcmd_swap, bcmd_rot, bcmd_mrot, bcmd_over, bcmd_pick, bcmd_roll # 0x30 .quad bcmd_depth, bcmd_nip, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_get, bcmd_set, bcmd_get8, bcmd_set8, bcmd_get16, bcmd_set16, bcmd_get32, bcmd_set32 # 0x40 .quad bcmd_setp, bcmd_setm, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_zeq, bcmd_zlt, bcmd_zgt, bcmd_eq, bcmd_lt, bcmd_gt, bcmd_lteq, bcmd_gteq # 0x50 .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_2r, bcmd_r2, bcmd_rget, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad # 0x60 
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_type, bcmd_emit, bcmd_str, bcmd_strp, bcmd_count, bcmd_bad, bcmd_bad, bcmd_bad # 0x80 .quad bcmd_expect, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad # 0x90 .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad .quad bcmd_blword, bcmd_quit, bcmd_find, bcmd_cfa, bcmd_execute, bcmd_numberq, bcmd_bad, bcmd_bad # 0xF0 .quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad # forth last_item context @ ! quit start: .byte b_call16 .word forth - . - 2 .byte b_call16 .word last_item - . - 2 .byte b_call16 .word context - . - 2 .byte b_get .byte b_set .byte b_quit inbuf: .byte b_var0 .space inbuf_size # begin inbuf dup tib ! inbuf_size expect span @ #tib ! 0 >in ! 
interpret again quit: .byte b_strp, 1 .ascii "\n" .byte b_call16 .word prstack - . - 2 .byte b_strp .byte 2 .ascii "> " .byte b_call16 .word inbuf - . - 2 .byte b_dup .byte b_call16 .word tib - . - 2 .byte b_set .byte b_lit16 .word inbuf_size .byte b_expect .byte b_call16 .word span - . - 2 .byte b_get .byte b_call16 .word ntib - . - 2 .byte b_set .byte b_num0 .byte b_call16 .word bin - . - 2 .byte b_set .byte b_call16 .word interpret - . - 2 .byte b_branch8, quit - . p_item = . item forth forth: .byte b_var8 .byte does_voc - . .quad 0 does_voc: .byte b_call8 .byte context - . - 1 .byte b_set .byte b_exit item current .byte b_var0 .quad 0 item context context: .byte b_var0 v_context: .quad 0 item 0, f_code .byte b_num0 .byte b_exit item 1, f_code .byte b_num1 .byte b_exit item 2, f_code .byte b_num2 .byte b_exit item 3, f_code .byte b_num3 .byte b_exit item 4, f_code .byte b_num4 .byte b_exit item 8, f_code .byte b_num8 .byte b_exit item lit8, f_code .byte b_lit8 .byte 31 .byte b_exit item lit16, f_code .byte b_lit16 .word 31415 .byte b_exit item lit32, f_code .byte b_lit32 .int 31415926 .byte b_exit item lit64, f_code .byte b_lit64 .quad 31415926 .byte b_exit item call8, f_code .byte b_call8 .byte 0f - . - 1 0: .byte b_exit item call16, f_code .byte b_call16 .word 0f - . - 2 0: .byte b_exit item call32, f_code .byte b_call32 .int 0f - . - 4 0: .byte b_exit item branch8, f_code .byte b_branch8 .byte 0f - . 0: .byte b_exit item branch16, f_code .byte b_branch16 .word 0f - . 0: .byte b_exit item qbranch8, f_code .byte b_qbranch8 .byte 0f - . 0: .byte b_exit item qbranch16, f_code .byte b_qbranch16 .word 0f - . 
0: .byte b_exit item exit, f_code .byte b_exit item 1-, f_code .byte b_wm .byte b_exit item 1+, f_code .byte b_wp .byte b_exit item +, f_code .byte b_add .byte b_exit item -, f_code .byte b_sub .byte b_exit item *, f_code .byte b_mul .byte b_exit item /, f_code .byte b_div .byte b_exit item mod, f_code .byte b_mod .byte b_exit item /mod, f_code .byte b_divmod .byte b_exit item abs, f_code .byte b_abs .byte b_exit item dup, f_code .byte b_dup .byte b_exit item drop, f_code .byte b_drop .byte b_exit item swap, f_code .byte b_swap .byte b_exit item rot, f_code .byte b_rot .byte b_exit item -rot, f_code .byte b_mrot .byte b_exit item over, f_code .byte b_over .byte b_exit item pick, f_code .byte b_pick .byte b_exit item roll, f_code .byte b_roll .byte b_exit item depth, f_code .byte b_depth .byte b_exit item @, f_code .byte b_get .byte b_exit item !, f_code .byte b_set .byte b_exit item c@, f_code .byte b_get8 .byte b_exit item c!, f_code .byte b_set8 .byte b_exit item w@, f_code .byte b_get16 .byte b_exit item w!, f_code .byte b_set16 .byte b_exit item i@, f_code .byte b_get32 .byte b_exit item i!, f_code .byte b_set32 .byte b_exit item +!, f_code .byte b_setp .byte b_exit item -!, f_code .byte b_setm .byte b_exit item >r, f_code .byte b_2r .byte b_exit item r>, f_code .byte b_r2 .byte b_exit item r@, f_code .byte b_rget .byte b_exit item "0=", f_code .byte b_zeq .byte b_exit item 0<, f_code .byte b_zlt .byte b_exit item 0>, f_code .byte b_zgt .byte b_exit item "=", f_code .byte b_eq .byte b_exit item <, f_code .byte b_lt .byte b_exit item >, f_code .byte b_gt .byte b_exit item "<=", f_code .byte b_lteq .byte b_exit item ">=", f_code .byte b_gteq .byte b_exit item type, f_code .byte b_type .byte b_exit item expect, f_code .byte b_expect .byte b_exit item emit, f_code .byte b_emit .byte b_exit item count, f_code .byte b_count .byte b_exit item "(\")", f_code .byte b_str .byte b_exit item "(.\")", f_code .byte b_strp .byte b_exit item var8, f_code .byte b_var8 .byte 0f 
- . 0: .byte b_exit item var16, f_code .byte b_var16 .word 0f - . 0: .byte b_exit item base base: .byte b_var0 v_base: .quad 10 holdbuf_len = 70 item holdbuf holdbuf: .byte b_var0 .space holdbuf_len item holdpoint holdpoint: .byte b_var0 .quad 0 item span span: .byte b_var0 v_span: .quad 0 # : hold holdpoint @ 1- dup holdbuf > if drop drop else dup holdpoint ! c! then ; item hold hold: .byte b_call8 .byte holdpoint - . - 1 # holdpoint .byte b_get # @ .byte b_wm # 1- .byte b_dup # dup .byte b_call8 .byte holdbuf - . - 1 # holdbuf .byte b_gt # > .byte b_qbranch8 # if .byte 0f - . .byte b_drop # drop .byte b_drop # drop .byte b_branch8 #     ( then) .byte 1f - . 0: .byte b_dup # dup .byte b_call8 .byte holdpoint - . - 1 # holdpoint .byte b_set # ! .byte b_set8 # c! 1: .byte b_exit # ; # : # base /mod swap dup 10 < if c" 0 + else 10 - c" A + then hold ; item # conv: .byte b_call16 .word base - . - 2 # base .byte b_get # @ .byte b_divmod # /mod .byte b_swap # swap .byte b_dup # dup .byte b_lit8 .byte 10 # 10 .byte b_lt # < .byte b_qnbranch8 # if .byte 0f - . .byte b_lit8 .byte '0' # c" 0 .byte b_add # + .byte b_branch8 # else .byte 1f - . 0: .byte b_lit8 .byte '?' # c" A .byte b_add # + 1: .byte b_call16 .word hold - . - 2 # hold .byte b_exit # ; # : <# holdbuf 70 + holdpoint ! ; item <# conv_start: .byte b_call16 .word holdbuf - . - 2 .byte b_lit8 .byte holdbuf_len .byte b_add .byte b_call16 .word holdpoint - . - 2 .byte b_set .byte b_exit # : #s do # dup 0=until ; item #s conv_s: .byte b_call8 .byte conv - . - 1 .byte b_dup .byte b_qbranch8 .byte conv_s - . .byte b_exit # : #> holdpoint @ holdbuf 70 + over - ; item #> conv_end: .byte b_call16 .word holdpoint - . - 2 .byte b_get .byte b_call16 .word holdbuf - . - 2 .byte b_lit8 .byte holdbuf_len .byte b_add .byte b_over .byte b_sub .byte b_exit item . dot: .byte b_dup .byte b_abs .byte b_call8 .byte conv_start - . - 1 .byte b_lit8 .byte ' ' .byte b_call16 .word hold - . - 2 .byte b_call8 .byte conv_s - . 
- 1
        .byte b_drop
        .byte b_zlt
        .byte b_qnbranch8
        .byte 1f - .
        .byte b_lit8
        .byte '-'
        .byte b_call16
        .word hold - . - 2
1:      .byte b_call8
        .byte conv_end - . - 1
        .byte b_type
        .byte b_exit

        item tib
tib:    .byte b_var0
v_tib:  .quad 0

        item #tib
ntib:   .byte b_var0
v_ntib: .quad 0

        item >in
bin:    .byte b_var0
v_in:   .quad 0

# : .s depth dup . c": emit do dup while dup pick . 1- again drop ;
        item .s                 # 11 22 33
prstack:
        .byte b_depth           # 11 22 33 3
        .byte b_dup             # 11 22 33 3 3
        .byte b_strp
        .byte 2
        .ascii "( "
        .byte b_call16          # 11 22 33 3
        .word dot - . - 2
        .byte b_strp            # 11 22 33 3
        .byte 3
        .ascii "): "
        .byte b_dup, b_zlt
        .byte b_qnbranch8, 1f - .
        .byte b_strp
        .byte 14
        .ascii "\nStack fault!\n"
        .byte b_quit
1:      .byte b_dup             # 11 22 33 3 3
        .byte b_qnbranch8       # 11 22 33 3
        .byte 2f - .
        .byte b_dup             # 11 22 33 3 3
        .byte b_pick            # 11 22 33 3 11
        .byte b_call16          # 11 22 33 3
        .word dot - . - 2
        .byte b_wm              # 11 22 33 2
        .byte b_branch8
        .byte 1b - .
2:      .byte b_drop            # 11 22 33
        .byte b_exit

.macro prs new_line = 1
        .byte b_call16
        .word prstack - . - 2
        .if \new_line > 0
        .byte b_lit8, '\n'
        .byte b_emit
        .endif
.endm

.macro pr string
        .byte b_strp
        .byte 9f - 8f
8:      .ascii "\n\string"
9:
.endm

        item interpret
interpret:
        .byte b_blword
        .byte b_dup
        .byte b_qnbranch8
        .byte 0f - .
        .byte b_over
        .byte b_over
        .byte b_find
        .byte b_dup
        .byte b_qnbranch8
        .byte 1f - .
        .byte b_mrot
        .byte b_drop
        .byte b_drop
        .byte b_cfa
        .byte b_execute
        .byte b_branch8
        .byte 2f - .
1:      .byte b_drop
        .byte b_over, b_over
        .byte b_numberq         # try to convert the word to a number
        .byte b_qbranch8, 3f - . # flag is nonzero: it is a number, go to label 3
        .byte b_type            # otherwise print the word
        .byte b_strp            # followed by the message
        .byte 19                # message length
        .ascii " : word not found!\n"
        .byte b_quit            # and return to the command line
3:      .byte b_nip, b_nip      # drop the address and length (copied by b_over, b_over)
2:                              # a word was executed or a number was pushed
        .byte b_depth           # stack depth
        .byte b_zlt             # is it below 0 (0<)?
        .byte b_qnbranch8, interpret_ok - . # no: the stack is fine, carry on
        .byte b_strp            # yes: print the error message
        .byte 14
        .ascii "\nstack fault!\n"
        .byte b_quit            # and return to the command line
interpret_ok:
        .byte b_branch8
        .byte interpret - .
0:      .byte b_drop
        .byte b_exit

last_item:
        .byte b_var0
        item bye, f_code
        .byte b_bye

.section .text
.global _start                  # program entry point
_start:
        mov rbp, rsp
        sub rbp, stack_size
        lea r8, start
        mov init_stack, rsp
        mov init_rstack, rbp
        jmp _next

b_var0 = 0x28
bcmd_var0:
        push r8                 # falls through into bcmd_exit
b_exit = 0x17
bcmd_exit:
        mov r8, [rbp]
        add rbp, 8
_next:
        movzx rcx, byte ptr [r8]
        inc r8
        jmp [bcmd + rcx*8]

b_num0 = 0x02
bcmd_num0:
        push 0
        jmp _next

b_num1 = 0x03
bcmd_num1:
        push 1
        jmp _next

b_num2 = 0x04
bcmd_num2:
        push 2
        jmp _next

b_num3 = 0x05
bcmd_num3:
        push 3
        jmp _next

b_num4 = 0x06
bcmd_num4:
        push 4
        jmp _next

b_num8 = 0x07
bcmd_num8:
        push 8
        jmp _next

b_lit8 = 0x08
bcmd_lit8:
        movsx rax, byte ptr [r8]
        inc r8
        push rax
        jmp _next

b_lit16 = 0x09
bcmd_lit16:
        movsx rax, word ptr [r8]
        add r8, 2
        push rax
        jmp _next

b_call8 = 0x0C
bcmd_call8:
        movsx rax, byte ptr [r8]
        sub rbp, 8
        inc r8
        mov [rbp], r8
        add r8, rax
        jmp _next

b_call16 = 0x0D
bcmd_call16:
        movsx rax, word ptr [r8]
        sub rbp, 8
        add r8, 2
        mov [rbp], r8
        add r8, rax
        jmp _next

b_call32 = 0x0E
bcmd_call32:
        movsx rax, dword ptr [r8]
        sub rbp, 8
        add r8, 4
        mov [rbp], r8
        add r8, rax
        jmp _next

b_lit32 = 0x0A
bcmd_lit32:
        movsx rax, dword ptr [r8]
        add r8, 4
        push rax
        jmp _next

b_lit64 = 0x0B
bcmd_lit64:
        mov rax, [r8]
        add r8, 8
        push rax
        jmp _next

b_dup = 0x30
bcmd_dup:
        push [rsp]
        jmp _next

b_wm = 0x20
bcmd_wm:
        decq [rsp]
        jmp _next

b_wp = 0x18
bcmd_wp:
        incq [rsp]
        jmp _next

b_add = 0x21
bcmd_add:
        pop rax
        add [rsp], rax
        jmp _next

b_sub = 0x22
bcmd_sub:
        pop rax
        sub [rsp], rax
        jmp _next

b_mul = 0x23
bcmd_mul:
        pop rax
        pop rbx
        imul rbx
        push rax
        jmp _next

b_div = 0x24
bcmd_div:
        pop rbx
        pop rax
        cqo
        idiv rbx
        push rax
        jmp _next

b_mod = 0x25
bcmd_mod:
        pop rbx
        pop rax
        cqo
        idiv rbx
        push rdx
        jmp _next

b_divmod = 0x26
bcmd_divmod:
        pop rbx
        pop rax
        cqo
        idiv rbx
        push rdx
        push rax
        jmp _next

b_abs = 0x27
bcmd_abs:
        mov rax, [rsp]
        or rax, rax
        jge _next
        neg rax
        mov [rsp], rax
        jmp _next

b_drop = 0x31
bcmd_drop:
        add rsp, 8
        jmp _next

b_swap = 0x32
bcmd_swap:
        pop rax
        pop rbx
        push rax
        push rbx
        jmp _next

b_rot = 0x33
bcmd_rot:
        pop rax
        pop rbx
        pop rcx
        push rbx
        push rax
        push rcx
        jmp _next

b_mrot = 0x34
bcmd_mrot:
        pop rcx
        pop rbx
        pop rax
        push rcx
        push rax
        push rbx
        jmp _next

b_over = 0x35
bcmd_over:
        push [rsp + 8]
        jmp _next

b_pick = 0x36
bcmd_pick:
        pop rcx
        push [rsp + 8*rcx]
        jmp _next

b_roll = 0x37
bcmd_roll:
        pop rcx
        mov rbx, [rsp + 8*rcx]
roll1:
        mov rax, [rsp + 8*rcx - 8]
        mov [rsp + 8*rcx], rax
        dec rcx
        jnz roll1
        push rbx
        jmp _next

b_depth = 0x38
bcmd_depth:
        mov rax, init_stack
        sub rax, rsp
        sar rax, 3
        push rax
        jmp _next

b_nip = 0x39
bcmd_nip:
        pop rax
        mov [rsp], rax
        jmp _next

b_get = 0x40
bcmd_get:
        pop rcx
        push [rcx]
        jmp _next

b_set = 0x41
bcmd_set:
        pop rcx
        pop rax
        mov [rcx], rax
        jmp _next

b_get8 = 0x42
bcmd_get8:
        pop rcx
        movsx rax, byte ptr [rcx]
        push rax
        jmp _next

b_set8 = 0x43
bcmd_set8:
        pop rcx
        pop rax
        mov [rcx], al
        jmp _next

b_get16 = 0x44
bcmd_get16:
        pop rcx
        movsx rax, word ptr [rcx]
        push rax
        jmp _next

b_set16 = 0x45
bcmd_set16:
        pop rcx
        pop rax
        mov [rcx], ax
        jmp _next

b_get32 = 0x46
bcmd_get32:
        pop rcx
        movsx rax, dword ptr [rcx]
        push rax
        jmp _next

b_set32 = 0x47
bcmd_set32:
        pop rcx
        pop rax
        mov [rcx], eax
        jmp _next

b_setp = 0x48
bcmd_setp:
        pop rcx
        pop rax
        add [rcx], rax
        jmp _next

b_setm = 0x49
bcmd_setm:
        pop rcx
        pop rax
        sub [rcx], rax
        jmp _next

b_2r = 0x60
bcmd_2r:
        pop rax
        sub rbp, 8
        mov [rbp], rax
        jmp _next

b_r2 = 0x61
bcmd_r2:
        push [rbp]
        add rbp, 8
        jmp _next

b_rget = 0x62
bcmd_rget:
        push [rbp]
        jmp _next

# 0=
b_zeq = 0x50
bcmd_zeq:
        pop rax
        or rax, rax
        jnz rfalse
rtrue:
        push -1
        jmp _next
rfalse:
        push 0
        jmp _next

# 0<
b_zlt = 0x51
bcmd_zlt:
        pop rax
        or rax, rax
        jl rtrue
        push 0
        jmp _next

# 0>
b_zgt = 0x52
bcmd_zgt:
        pop rax
        or rax, rax
        jg rtrue
        push 0
        jmp _next

# =
b_eq = 0x53
bcmd_eq:
        pop rbx
        pop rax
        cmp rax, rbx
        jz rtrue
        push 0
        jmp _next

# <
b_lt = 0x54
bcmd_lt:
        pop rbx
        pop rax
        cmp rax, rbx
        jl rtrue
        push 0
        jmp _next

# >
b_gt = 0x55
bcmd_gt:
        pop rbx
        pop rax
        cmp rax, rbx
        jg rtrue
        push 0
        jmp _next

# <=
b_lteq = 0x56
bcmd_lteq:
        pop rbx
        pop rax
        cmp rax, rbx
        jle rtrue
        push 0
        jmp _next

# >=
b_gteq = 0x57
bcmd_gteq:
        pop rbx
        pop rax
        cmp rax, rbx
        jge rtrue
        push 0
        jmp _next

b_var8 = 0x29
bcmd_var8:
        push r8                 # falls through into bcmd_branch8
b_branch8 = 0x10
bcmd_branch8:
        movsx rax, byte ptr [r8]
        add r8, rax
        jmp _next

b_var16 = 0x30                  # same value as b_dup in this listing
bcmd_var16:
        push r8                 # falls through into bcmd_branch16
b_branch16 = 0x11
bcmd_branch16:
        movsx rax, word ptr [r8]
        add r8, rax
        jmp _next

b_qbranch8 = 0x12
bcmd_qbranch8:
        pop rax
        or rax, rax
        jnz bcmd_branch8
        inc r8
        jmp _next

b_qbranch16 = 0x13
bcmd_qbranch16:
        pop rax
        or rax, rax
        jnz bcmd_branch16
        add r8, 2
        jmp _next

b_qnbranch8 = 0x14
bcmd_qnbranch8:
        pop rax
        or rax, rax
        jz bcmd_branch8
        inc r8
        jmp _next

b_qnbranch16 = 0x15
bcmd_qnbranch16:
        pop rax
        or rax, rax
        jz bcmd_branch16
        add r8, 2
        jmp _next

b_bad = 0x00
bcmd_bad:
        mov rax, 1              # system call 1 - sys_write
        mov rdi, 1              # file descriptor 1 - stdout
        mov rsi, offset msg_bad_byte # address of the message
        mov rdx, msg_bad_byte_len # length of the message
        syscall                 # make the system call
        mov rax, 60             # system call 60 - sys_exit
        mov rdi, 1              # exit code 1
        syscall                 # make the system call

b_bye = 0x01
bcmd_bye:
        mov rax, 1              # system call 1 - sys_write
        mov rdi, 1              # file descriptor 1 - stdout
        mov rsi, offset msg_bye # address of the message
        mov rdx, msg_bye_len    # length of the message
        syscall                 # make the system call
        mov rax, 60             # system call 60 - sys_exit
        mov rdi, 0              # exit code 0
        syscall                 # make the system call

b_strp = 0x83
bcmd_strp:
        movsx rax, byte ptr [r8]
        inc r8
        push r8
        add r8, rax
        push rax                # falls through into bcmd_type
b_type = 0x80
bcmd_type:
        mov rax, 1              # system call 1 - sys_write
        mov rdi, 1              # file descriptor 1 - stdout
        pop rdx                 # string length
        pop rsi                 # string address
        push r8
        syscall                 # make the system call
        pop r8
        jmp _next

b_expect = 0x88
bcmd_expect:
        mov rax, 0              # system call 0 - sys_read
        mov rdi, 0              # file descriptor 0 - stdin
        pop rdx                 # buffer length
        pop rsi                 # buffer address
        push r8
        syscall                 # make the system call
        pop r8
        mov rbx, rax
        or rax, rax
        jge 1f
        xor rbx, rbx            # on error, report 0 characters read
1:      mov v_span, rbx
        jmp _next

b_str = 0x82
bcmd_str:
        movzx rax, byte ptr [r8]
        lea r8, [r8 + rax + 1]
        jmp _next

b_count = 0x84
bcmd_count:
        pop rcx
        movzx rax, byte ptr [rcx]
        inc rcx
        push rcx
        push rax
        jmp _next

b_emit = 0x81
bcmd_emit:
        pop rax
        mov rsi, offset emit_buf # address of the one-byte buffer
        mov [rsi], al
        mov rax, 1              # system call 1 - sys_write
        mov rdi, 1              # file descriptor 1 - stdout
        mov rdx, 1              # length
        push r8
        syscall                 # make the system call
        pop r8
        jmp _next

b_blword = 0xF0
bcmd_blword:
        mov rsi, v_tib          # address of the input buffer
        mov rdx, rsi            # keep the buffer start in RDX for later
        mov rax, v_in           # current offset in the input buffer
        mov rcx, v_ntib         # size of the input buffer
        mov rbx, rcx
        add rsi, rax            # RSI - pointer to the text still to be parsed
        sub rcx, rax            # number of characters left
        jz 3f
word2:
        lodsb                   # load AL from [RSI] and increment RSI
        cmp al, ' '
        ja 1f                   # not a separator (space or any lower code): a word starts here
        dec rcx
        jnz word2               # keep skipping until the buffer ends
3:      sub rsi, rdx
        mov v_in, rsi
        push rcx
        jmp _next
1:      lea rdi, [rsi - 1]      # RDI = RSI - 1 (start of the word)
        dec rcx
        jz word9
word3:
        lodsb
        cmp al, ' '
        jbe 2f
        dec rcx
        jnz word3
word9:
        inc rsi
2:      mov rax, rsi
        sub rsi, rdx            # offset of the first unparsed character (relative to the buffer start)
        cmp rsi, rbx
        jle 4f
        mov rsi, rbx
4:      mov v_in, rsi
        sub rax, rdi
        dec rax
        jz word1
        push rdi                # word address
word1:
        push rax                # word length
        jmp _next

b_quit = 0xF1
bcmd_quit:
        lea r8, quit
        mov rsp, init_stack
        mov rbp, init_rstack
        jmp _next

b_find = 0xF2
bcmd_find:
        pop rbx                 # length of the word
        pop r9                  # address of the word
        mov rdx, v_context
        mov rdx, [rdx]          # address of the last dictionary entry
                                # search loop
find0:
        mov al, [rdx]           # flags
        and al, 3               # size code of the link field; the other flags are not needed here
        or al, al
        jz find_l8
        cmp al, 1
        jz find_l16
        cmp al, 2
        jz find_l32
        mov r10, [rdx + 1]      # 64-bit offset
        lea rsi, [rdx + 9]      # address of the name
        jmp find1
find_l32:
        movsx r10, dword ptr [rdx + 1] # 32-bit offset
        lea rsi, [rdx + 5]      # address of the name
        jmp find1
find_l16:
        movsx r10, word ptr [rdx + 1] # 16-bit offset
        lea rsi, [rdx + 3]      # address of the name
        jmp find1
find_l8:
        movsx r10, byte ptr [rdx + 1] # 8-bit offset
        lea rsi, [rdx + 2]      # address of the name
find1:
        movzx rax, byte ptr [rsi] # name length of the entry being checked
        cmp rax, rbx
        jz find2                # lengths match, compare the strings
find3:
        or r10, r10
        jz find_notfound        # end of the dictionary, nothing found
        add rdx, r10            # move to the next entry
        jmp find0               # and check it
find2:
        inc rsi
        mov rdi, r9
        mov rcx, rax
        repz cmpsb
        jnz find3               # names differ, keep searching
        push rdx                # found: push the entry address
        jmp _next
find_notfound:
        push r10
        jmp _next

b_cfa = 0xF3
bcmd_cfa:
        pop rdx                 # address of the dictionary entry
        mov al, [rdx]           # flags
        and al, 3               # size code of the link field; the other flags are not needed here
        or al, al
        jz cfa_l8
        cmp al, 1
        jz cfa_l16
        cmp al, 2
        jz cfa_l32
        lea rsi, [rdx + 9]      # address of the name (64-bit link)
        jmp cfa1
cfa_l32:
        lea rsi, [rdx + 5]      # address of the name (32-bit link)
        jmp cfa1
cfa_l16:
        lea rsi, [rdx + 3]      # address of the name (16-bit link)
        jmp cfa1
cfa_l8:
        lea rsi, [rdx + 2]      # address of the name (8-bit link)
cfa1:
        xor rax, rax
        lodsb
        add rsi, rax
        push rsi
        jmp _next

b_execute = 0xF4
bcmd_execute:
        sub rbp, 8
        mov [rbp], r8           # save the return address on the return stack
        pop r8                  # address of the bytecode to run
        jmp _next

b_numberq = 0xF5
bcmd_numberq:
        pop rcx                 # length of the string
        pop rsi                 # address
        xor rax, rax            # the number being built
        xor rbx, rbx            # holds the current digit
        mov r9, v_base          # number base
        xor r10, r10            # sign flag
        or rcx, rcx
        jz num_false
        mov bl, [rsi]
        cmp bl, '+'
        jnz 1f
        inc rsi
        dec rcx
        jz num_false
        jmp num0
1:      cmp bl, '-'
        jnz num0
        mov r10, 1
        inc rsi
        dec rcx
        jz num_false
num0:
        mov bl, [rsi]
        cmp bl, '0'
        jb num_false
        cmp bl, '9'
        jbe num_09
        cmp bl, 'A'
        jb num_false
        cmp bl, 'Z'
        jbe num_AZ
        cmp bl, 'a'
        jb num_false
        cmp bl, 'z'
        ja num_false
        sub bl, 'a' - 10
        jmp num_check
num_AZ:
        sub bl, 'A' - 10
        jmp num_check
num_09:
        sub bl, '0'
num_check:
        cmp rbx, r9
        jge num_false
        mul r9
        add rax, rbx
        inc rsi
        dec rcx
        jnz num0
        or r10, r10             # was there a minus sign?
        jz 2f
        neg rax                 # apply the minus sign
2:      push rax
        push 1
        jmp _next
num_false:
        xor rcx, rcx
        push rcx
        jmp _next
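The conversion done by bcmd_numberq is easier to follow in a high-level form. Below is a minimal Python sketch of the same logic. It is an illustration, not part of the byte-machine: the name parse_number is made up here, failure is reported as None instead of a single 0 flag on the stack, a leading minus is assumed to negate the result, and 64-bit overflow is not modelled.

```python
def parse_number(text, base=10):
    """Model of the number? conversion: return the value, or None on failure.

    Hypothetical helper for illustration only; the real command works on
    an address/length pair and pushes either (value, 1) or 0.
    """
    if not text:
        return None
    negative = False
    if text[0] in "+-":
        negative = text[0] == "-"
        text = text[1:]
        if not text:            # a lone sign is not a number
            return None
    value = 0
    for ch in text:
        if "0" <= ch <= "9":
            digit = ord(ch) - ord("0")
        elif "A" <= ch <= "Z":
            digit = ord(ch) - ord("A") + 10
        elif "a" <= ch <= "z":
            digit = ord(ch) - ord("a") + 10
        else:
            return None         # not a digit in any base
        if digit >= base:       # digit is too large for the current base
            return None
        value = value * base + digit
    return -value if negative else value
```

For example, parse_number("ff", 16) gives 255, while parse_number("12a") gives None, which is the case where interpret prints "word not found!".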

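The source code keeps growing, so this is the last time I include it here in full.

From now on it lives on GitHub: https://github.com/hal9000cc/forth64
The bin folder of the repository contains a build already compiled for Linux x64. If you have Linux, you can download it and run it.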

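If you have Windows, you can install WSL (Windows Subsystem for Linux). I was away on holiday and did exactly that. It turned out to be very simple and took about five minutes. There was only one catch: it did not start right away, because the subsystem first had to be "enabled" with a PowerShell command. I followed the link in the error message, ran the command, and it worked.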

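For real Indians, though, there is also a way to run all of this natively under Windows :) It is not hard to do: only the few words that interact with the system need to be redone.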

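That's all for now! Next time we will get the compiler running.

It will become possible to compile new words, and there will be conditionals and loops. In fact, it will be possible to write in more or less standard Forth, compile it to bytecode and execute it. And then we can do more serious testing and check the performance of the byte-machine.

Continued: Byte-machine for Forth (and not only) the Indian way (part 4)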
