Metamorphing Machine I rather be this walking metamorphosis
than having that old formed opinion about everything!

Let's build a transpiler! Part 12

This is the twelfth post in a series on building a transpiler (and one of the longest, be warned!).
You can find the previous ones here.

Last time I said we would deal with contextual keywords. As you may know by now, they are identifiers that have special meaning in certain contexts.
For instance, you can have a Step variable and use it in a For/Next loop like this:

Dim Step As Integer
(...)
For X = 1 To 100 Step Step

The code I came up with is so convoluted that I'll have to present it to you piece by piece.

Let's deal with the easy parts first:
As I said previously, we'll have to tell the "binary" dot operator apart from the "unary" one.

When we see A.B in VB, the dot means B is a member of A, and A can be a function call or a Property Get returning a class or a type, or a variable or an argument of a class or a type.
But if we get A .B instead, it means B is a member of whatever the enclosing With statement refers to, and A is either a sub, a function, or a Declare call.
So, that space between the A and the dot changes things a bit.
By the way, the same happens to the bang operator ("!").

To deal with it, we'll have a Static LastToken variable. After all the processing on the current token is done, we'll store it in LastToken, so that when the next token happens to be a dot or a bang, we can see whether it was preceded by a space.
If so, we'll change that token's text to "~." or "~!" respectively.

Public Function TokenFrom(ByVal Scanner As Scanner) As Token
    Static LastToken As Token
    Dim Token As Token

    Set Token = Scanner.GetToken

    Select Case Token.Kind
        Case tkOperator
            If LastToken.Kind = tkWhiteSpace Then
                If Token.Text = "." Then
                    Token.Text = "~."
                ElseIf Token.Text = "!" Then
                    Token.Text = "~!"
                End If
            End If
    End Select

    Set LastToken = Token
    Set TokenFrom = Token
End Function
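
To see the idea in isolation, here is a minimal sketch in Python rather than VB, so it can be run on its own. The Token class, the kind names, and mark_with_members are made up for illustration and are not part of the series' Scanner.

```python
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # hypothetical kinds: "identifier", "whitespace", "operator"
    text: str


def mark_with_members(tokens):
    """Rewrite '.' and '!' to '~.' and '~!' when a whitespace token precedes them."""
    last = None  # plays the role of the Static LastToken variable
    result = []
    for token in tokens:
        preceded_by_space = last is not None and last.kind == "whitespace"
        if token.kind == "operator" and preceded_by_space and token.text in (".", "!"):
            token = Token(token.kind, "~" + token.text)
        result.append(token)
        last = token
    return result


# "A.B" keeps its binary dot; "A .B" gets the unary (With) marker.
binary = [Token("identifier", "A"), Token("operator", "."), Token("identifier", "B")]
unary = [Token("identifier", "A"), Token("whitespace", " "),
         Token("operator", "."), Token("identifier", "B")]
print([t.text for t in mark_with_members(binary)])  # ['A', '.', 'B']
print([t.text for t in mark_with_members(unary)])   # ['A', ' ', '~.', 'B']
```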

Speaking of dots and bangs, they play a role in fixing another glitch we have in our Scanner: what comes after a dot or a bang is a member name, so it must not be treated as a keyword even when it is spelled like one.
We'll do something similar to what we did above: we'll add a static Downgrade variable and, any time we see a dot or a bang, set it to True.
Whenever we get a keyword, we'll check whether we have to "downgrade" it to a regular identifier.
Let's adapt the code above to that:

Public Function TokenFrom(ByVal Scanner As Scanner) As Token
    Static LastToken As Token
    Static Downgrade As Boolean
    Dim Token As Token

    Set Token = Scanner.GetToken

    Select Case Token.Kind
        Case tkOperator
            Downgrade = Token.Text = "." Or Token.Text = "!"

            If LastToken.Kind = tkWhiteSpace Then
                If Token.Text = "." Then
                    Token.Text = "~."
                ElseIf Token.Text = "!" Then
                    Token.Text = "~!"
                End If
            End If

        Case tkKeyword
            If Downgrade Then
                Downgrade = False
                Token.Kind = tkIdentifier
            End If

        Case Else
            Downgrade = False
    End Select

    Set LastToken = Token
    Set TokenFrom = Token
End Function
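
The downgrade rule on its own can be sketched like this, again in Python so it runs standalone. Tokens are plain (kind, text) pairs here, and both the kind names and downgrade_member_names are hypothetical.

```python
def downgrade_member_names(tokens):
    """tokens is a list of (kind, text) pairs; a new list is returned."""
    downgrade = False  # True right after a "." or "!" operator
    result = []
    for kind, text in tokens:
        if kind == "operator":
            downgrade = text in (".", "!")
        else:
            if kind == "keyword" and downgrade:
                kind = "identifier"  # a member name that merely looks like a keyword
            downgrade = False
        result.append((kind, text))
    return result


# In "Obj.Print", Print arrives as a keyword but names a member.
member_call = [("identifier", "Obj"), ("operator", "."), ("keyword", "Print")]
# At the start of a statement, Print stays a keyword.
statement = [("keyword", "Print"), ("whitespace", " "), ("identifier", "X")]
print(downgrade_member_names(member_call)[2])  # ('identifier', 'Print')
print(downgrade_member_names(statement)[0])    # ('keyword', 'Print')
```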

So far, so good. Now, let's deal with another kind of ambiguity: String and Date identifiers can be either data types or function calls.
Here is an example:

Dim S As String
Dim D As Date
S = String(50, "@")
D = Date

How can we detect it? I first tried going with "If it is followed by an opening parenthesis, then it is a function call", but got bitten by the following situation:

Function Abcd() As String()

It went wrong: String is followed by an opening parenthesis there, yet it is a data type. So, we'll have to go with "If there's a preceding As then it is a keyword, otherwise it is a regular identifier."
By now you know the drill: Declare a static variable (WasAs), set it to True when we see an As keyword, and check it when we get a Date or String keyword to decide whether that token will remain a keyword or will be demoted to a regular identifier.

We just need to make sure WasAs goes back to False afterwards; the code below resets it on operators and line breaks.

Public Function TokenFrom(ByVal Scanner As Scanner) As Token
    Static LastToken As Token
    Static Downgrade As Boolean
    Static WasAs As Boolean
    Dim Token As Token

    Set Token = Scanner.GetToken

    Select Case Token.Kind
        Case tkOperator
            WasAs = False
            Downgrade = Token.Text = "." Or Token.Text = "!"

            If LastToken.Kind = tkWhiteSpace Then
                If Token.Text = "." Then
                    Token.Text = "~."
                ElseIf Token.Text = "!" Then
                    Token.Text = "~!"
                End If
            End If

        Case tkKeyword
            If Downgrade Then
                Downgrade = False
                Token.Kind = tkIdentifier
            Else
                Select Case Token.Text
                    Case "As"
                        WasAs = True

                    Case "Date", "String"
                        If Not WasAs Then Token.Kind = tkIdentifier
                End Select
            End If

        Case tkSoftLineBreak, tkHardLineBreak
            WasAs = False

        Case Else
            Downgrade = False
    End Select

    Set LastToken = Token
    Set TokenFrom = Token
End Function
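
Here is a small Python sketch of just the WasAs rule, run against the example from above. The (kind, text) pairs, the kind names, and classify_type_names are assumptions made for the sake of the example.

```python
def classify_type_names(tokens):
    """Keep Date/String as keywords only when an As keyword came before them."""
    was_as = False
    result = []
    for kind, text in tokens:
        if kind == "keyword" and text == "As":
            was_as = True
        elif kind == "keyword" and text in ("Date", "String"):
            if not was_as:
                kind = "identifier"  # a call such as String(50, "@")
        elif kind in ("operator", "linebreak"):
            was_as = False  # whitespace deliberately leaves the flag alone
        result.append((kind, text))
    return result


# Dim S As String <newline> S = String(
source = [
    ("keyword", "Dim"), ("whitespace", " "), ("identifier", "S"), ("whitespace", " "),
    ("keyword", "As"), ("whitespace", " "), ("keyword", "String"), ("linebreak", "\n"),
    ("identifier", "S"), ("whitespace", " "), ("operator", "="), ("whitespace", " "),
    ("keyword", "String"), ("leftparen", "("),
]
kinds = [kind for kind, text in classify_type_names(source) if text == "String"]
print(kinds)  # ['keyword', 'identifier']
```

Note that the flag must survive the whitespace token between As and String, which is why only operators and line breaks clear it.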

The Declare statement has two contextual keywords, Lib and Alias.
We'll have a static variable (State) that will flag when we reach a Declare token. Then we will change it as we walk along with the statement and meet its contextual keywords.
If State is not set, then we know Lib and Alias are regular identifiers.
Here is the code we'll insert into the TokenFrom function:

Private Enum NarrowContext
    NoContext
    DeclareContext
    DeclareLibContext
    DeclareAliasContext
End Enum


Static State As NarrowContext
Dim Upgrade As Boolean
Dim Revoke As Boolean

Rem Inside "Select Case Token.Text":
Case "Declare"
    If State = NoContext Then State = DeclareContext

Rem Inside "Select Case State":
Case DeclareContext
    Upgrade = Token.Text = "PtrSafe"

    If Upgrade Then
        State = DeclareLibContext

    Else
        Upgrade = Token.Text = "Lib"
        If Upgrade Then State = DeclareAliasContext
    End If

Case DeclareLibContext
    Upgrade = Token.Text = "Lib"
    If Upgrade Then State = DeclareAliasContext

Case DeclareAliasContext
    Upgrade = Token.Text = "Alias"
    Revoke = True

Rem Below "End Select":
If Upgrade Then
    Token.Kind = tkKeyword
    If Revoke Then State = NoContext
End If
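
Put together, the Declare handling is a tiny state machine. The Python sketch below mirrors it under a few assumptions: tokens are (kind, text) pairs, whitespace tokens are left out for brevity, and the reset on a line break is borrowed from the full listing at the end of this post. Ctx and upgrade_declare_words are hypothetical names.

```python
from enum import Enum, auto


class Ctx(Enum):
    NONE = auto()
    DECLARE = auto()
    DECLARE_LIB = auto()
    DECLARE_ALIAS = auto()


def upgrade_declare_words(tokens):
    """Promote PtrSafe, Lib, and Alias to keywords, but only inside a Declare."""
    state = Ctx.NONE
    result = []
    for kind, text in tokens:
        if kind == "linebreak":
            state = Ctx.NONE
        elif kind == "keyword" and text == "Declare":
            if state is Ctx.NONE:
                state = Ctx.DECLARE
        elif kind == "identifier":
            upgrade = False
            if state is Ctx.DECLARE and text == "PtrSafe":
                upgrade, state = True, Ctx.DECLARE_LIB
            elif state in (Ctx.DECLARE, Ctx.DECLARE_LIB) and text == "Lib":
                upgrade, state = True, Ctx.DECLARE_ALIAS
            elif state is Ctx.DECLARE_ALIAS and text == "Alias":
                upgrade, state = True, Ctx.NONE  # last contextual keyword: revoke
            if upgrade:
                kind = "keyword"
        result.append((kind, text))
    return result


# Declare Sub Beep Lib "kernel32" Alias "Beep" <newline> Lib = 1
line = [
    ("keyword", "Declare"), ("keyword", "Sub"), ("identifier", "Beep"),
    ("identifier", "Lib"), ("string", '"kernel32"'),
    ("identifier", "Alias"), ("string", '"Beep"'), ("linebreak", "\n"),
    ("identifier", "Lib"), ("operator", "="), ("number", "1"),
]
lib_kinds = [kind for kind, text in upgrade_declare_words(line) if text == "Lib"]
print(lib_kinds)  # ['keyword', 'identifier']
```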

As already mentioned, For has the Step contextual keyword:

Private Enum NarrowContext
    NoContext
    DeclareContext
    DeclareLibContext
    DeclareAliasContext
    ForNextContext
    ForToContext
End Enum


Rem Inside "Select Case Token.Text":
Case "For"
    If State = NoContext Then State = ForNextContext

Case "To"
    If State = ForNextContext Then State = ForToContext

Rem Inside "Select Case State":
Case ForToContext
    Upgrade = Token.Text = "Step"
    Revoke = True

The Option statement has some contextual keywords, too:

Private Enum NarrowContext
    NoContext
    DeclareContext
    DeclareLibContext
    DeclareAliasContext
    ForNextContext
    ForToContext
    OptionContext
    OptionCompareContext
End Enum

Rem Inside "Select Case Token.Text":
Case "Option"
    If State = NoContext Then State = OptionContext

Rem Inside "Select Case State":
Case OptionContext
    Upgrade = Token.Text = "Base"
    If Not Upgrade Then Upgrade = Token.Text = "Explicit"

    If Not Upgrade Then
        Upgrade = Token.Text = "Compare"
        If Upgrade Then State = OptionCompareContext
    End If

Case OptionCompareContext
    Upgrade = Token.Text = "Binary"
    If Not Upgrade Then Upgrade = Token.Text = "Text"

Error can be a keyword (On Error ...) or a regular identifier:

Rem Inside "Select Case Token.Text":
Case "On"
    If State = NoContext Then State = OnContext

Rem Inside "Select Case State":
Case OnContext
    Upgrade = Token.Text = "Error"
    Revoke = True

Line, too, can be a keyword or a regular identifier, but we need to read the next token to decide which one it is.
So, we'll have one more static variable (NextToken) to hold the next token; we'll fill it when we need to look ahead and clear it once it has been consumed:

Static NextToken As Token

If NextToken Is Nothing Then
    Set Token = Scanner.GetToken
Else
    Set Token = NextToken
    Set NextToken = Nothing
End If

Rem Inside "Select Case State":
Case NoContext
    Select Case Token.Text
        Case "Line"
            Set NextToken = Scanner.GetToken
            Upgrade = NextToken.Kind = tkKeyword And NextToken.Text = "Input"
    End Select
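
The one-token lookahead is the only new mechanism here, so a standalone sketch may help. It is in Python, with (kind, text) pairs instead of Token objects, and it assumes whitespace tokens have already been dropped; the Lookahead class is hypothetical.

```python
class Lookahead:
    """Hands out tokens one at a time, peeking one token ahead after 'Line'."""

    def __init__(self, tokens):
        self._source = iter(tokens)
        self._next = None  # plays the role of the Static NextToken variable

    def _read(self):
        return next(self._source, ("eof", ""))

    def token(self):
        if self._next is None:
            kind, text = self._read()
        else:
            kind, text = self._next  # hand out the token we peeked at earlier
            self._next = None
        if kind == "identifier" and text == "Line":
            self._next = self._read()
            if self._next == ("keyword", "Input"):
                kind = "keyword"  # "Line Input": Line acts as a keyword
        return kind, text


# Line Input ... versus Line = ...
stream = Lookahead([("identifier", "Line"), ("keyword", "Input"),
                    ("identifier", "Line"), ("operator", "=")])
seen = [stream.token() for _ in range(4)]
print(seen[0], seen[2])  # ('keyword', 'Line') ('identifier', 'Line')
```

The peeked token is not lost: the next call returns it before reading anything new from the source.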

Let's deal with Name and Reset:

Rem Inside "Select Case Token.Text" that's inside "Case NoContext":
Case "Name", "Reset"
    Set NextToken = Scanner.GetToken
    Upgrade = Right$(NextToken.Text, 1) <> "="
    If Upgrade Then Upgrade = NextToken.Kind <> tkKeyword Or NextToken.Text <> "As"
    If Upgrade Then Upgrade = NextToken.Kind <> tkOperator
    If Upgrade Then Upgrade = Not IsEndOfContext(NextToken)

And deal with Width:

Rem Inside "Select Case Token.Text" that's inside "Case NoContext":
Case "Width"
    Set NextToken = Scanner.GetToken
    Upgrade = NextToken.Kind = tkFileHandle

Now comes the hard part... There's a bunch of non-keywords that act as keywords of sorts in the Open statement.
There are also proper keywords there, but they serve a different purpose than usual, like Write, for instance.
The Open statement is so complex that I drew a diagram to show how it must be parsed:

[Diagram: Graphviz flow graph of the Open statement. It runs from start through Open, path, For, and one of Append, Binary, Input, Output, or Random; then the optional Access (Read, Write, or Read Write), Shared, and Lock (Read, Write, or Read Write) clauses; then As and the file handle, optionally followed by Len = size; and finally end.]

(Thanks to adrian.ancona's post about Graphviz.)
Afraid yet?

Note that red words are contextual keywords, while blue ones are proper keywords.
Based on the diagram above, here are the rules to change state when faced with an Open statement:
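
Read off the state names in the listing below, those rules can be sketched as a transition table. The Python sketch here is a reading of that listing, not a copy of it: the short state names, OPEN_RULES, and trace are made up, the words are taken as already split, and a file handle is assumed to be any word starting with "#".

```python
MODES = ("Append", "Binary", "Input", "Output", "Random")

# state -> {word: next state}; words not listed leave the state unchanged.
OPEN_RULES = {
    "start": {"Open": "for"},
    "for": {"For": "mode"},
    "mode": {word: "after_mode" for word in MODES},
    "after_mode": {"Access": "access", "Shared": "as", "As": "handle"},
    "access": {"Read": "access_read", "Write": "after_access"},
    "access_read": {"Write": "after_access", "Lock": "lock",
                    "Shared": "as", "As": "handle"},
    "after_access": {"Lock": "lock", "Shared": "as", "As": "handle"},
    "lock": {"Read": "lock_read", "Write": "as"},
    "lock_read": {"Write": "as", "As": "handle"},
    "as": {"As": "handle"},
    "handle": {"#": "len"},
    "len": {"Len": "start"},
}


def trace(words):
    """Return the state reached after each word of an Open statement."""
    state = "start"
    visited = []
    for word in words:
        key = "#" if word.startswith("#") else word
        state = OPEN_RULES[state].get(key, state)
        visited.append(state)
    return visited


# Open "f.dat" For Binary Access Read Write Lock Read As #1 Len = 16
words = ["Open", '"f.dat"', "For", "Binary", "Access", "Read", "Write",
         "Lock", "Read", "As", "#1", "Len"]
states = trace(words)
print(states[-1])  # start
```

The same word can mean different things depending on the state: the first Read above belongs to the Access clause, the second one to the Lock clause.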
Phew! The only thing left now is to code it out.
Added code is highlighted below.

Private Enum NarrowContext
    NoContext
    DeclareContext
    DeclareLibContext
    DeclareAliasContext
    ForNextContext
    ForToContext
    OptionContext
    OptionCompareContext
    OnContext
    [Next Keyword Is For]
    [Next Keyword Is Input | Next Identifier Is Append, Binary, Output, or Random]
    [Next Keyword Is As or Shared | Next Identifier Is Access]
    [Next Keyword Is Access/Write | Next Identifier Is Access/Read]
    [Next Keyword Is Access/Write, Lock, As, or Shared]
    [Next Keyword Is Lock, As, or Shared]
    [Next Keyword Is Lock/Write | Next Identifier Is Lock/Read]
    [Next Keyword Is Lock/Write or As]
    [Next Keyword Is As]
    [Next Token Is Filehandle]
    [Next Identifier Is Len]
End Enum


Public Function TokenFrom(ByVal Scanner As Scanner) As Token
    Static Downgrade As Boolean
    Static WasAs As Boolean
    Static LastToken As Token
    Static State As NarrowContext
    Static NextToken As Token
    Dim Upgrade As Boolean
    Dim Revoke As Boolean
    Dim Token As Token

    If NextToken Is Nothing Then
        Set Token = Scanner.GetToken
    Else
        Set Token = NextToken
        Set NextToken = Nothing
    End If

    If IsEndOfContext(Token) Then
        State = NoContext
    Else
        Select Case Token.Kind
            Case tkOperator
                WasAs = False
                Downgrade = Token.Text = "." Or Token.Text = "!"

                If LastToken.Kind = tkWhiteSpace Then
                    If Token.Text = "." Then
                        Token.Text = "~."
                    ElseIf Token.Text = "!" Then
                        Token.Text = "~!"
                    End If
                End If

            Case tkKeyword
                If Downgrade Then
                    Downgrade = False
                    Token.Kind = tkIdentifier

                Else
                    Select Case Token.Text
                        Case "As"
                            WasAs = True

                            Select Case State
                                Case [Next Keyword Is As or Shared | Next Identifier Is Access], _
                                     [Next Keyword Is Access/Write, Lock, As, or Shared], _
                                     [Next Keyword Is Lock, As, or Shared], _
                                     [Next Keyword Is Lock/Write or As], _
                                     [Next Keyword Is As]
                                    State = [Next Token Is Filehandle]
                            End Select

                        Case "Date", "String"
                            If Not WasAs Then Token.Kind = tkIdentifier

                        Case "Declare"
                            If State = NoContext Then State = DeclareContext

                        Case "For"
                            If State = NoContext Then
                                State = ForNextContext

                            ElseIf State = [Next Keyword Is For] Then
                                State = [Next Keyword Is Input | Next Identifier Is Append, Binary, Output, or Random]
                            End If

                        Case "Input"
                            If State = [Next Keyword Is Input | Next Identifier Is Append, Binary, Output, or Random] Then
                                State = [Next Keyword Is As or Shared | Next Identifier Is Access]
                            End If

                        Case "Lock"
                            Select Case State
                                Case [Next Keyword Is Access/Write, Lock, As, or Shared], _
                                     [Next Keyword Is Lock, As, or Shared]
                                    State = [Next Keyword Is Lock/Write | Next Identifier Is Lock/Read]
                            End Select

                        Case "Open"
                            If State = NoContext Then State = [Next Keyword Is For]

                        Case "Option"
                            If State = NoContext Then State = OptionContext

                        Case "On"
                            If State = NoContext Then State = OnContext

                        Case "To"
                            If State = ForNextContext Then State = ForToContext

                        Case "Shared"
                            Select Case State
                                Case [Next Keyword Is As or Shared | Next Identifier Is Access], _
                                     [Next Keyword Is Access/Write, Lock, As, or Shared], _
                                     [Next Keyword Is Lock, As, or Shared]
                                    State = [Next Keyword Is As]
                            End Select

                        Case "Write"
                            Select Case State
                                Case [Next Keyword Is Access/Write | Next Identifier Is Access/Read], _
                                     [Next Keyword Is Access/Write, Lock, As, or Shared]
                                    State = [Next Keyword Is Lock, As, or Shared]

                                Case [Next Keyword Is Lock/Write | Next Identifier Is Lock/Read], _
                                     [Next Keyword Is Lock/Write or As]
                                    State = [Next Keyword Is As]
                            End Select
                    End Select
                End If

            Case tkIdentifier
                Downgrade = False
                WasAs = False

                Select Case State
                    Case NoContext
                        Select Case Token.Text
                            Case "Line"
                                Set NextToken = Scanner.GetToken
                                Upgrade = NextToken.Kind = tkKeyword And NextToken.Text = "Input"

                            Case "Name"
                                Set NextToken = Scanner.GetToken
                                Upgrade = Right$(NextToken.Text, 1) <> "="

                            Case "Reset"
                                Set NextToken = Scanner.GetToken
                                Upgrade = IsEndOfContext(NextToken)

                            Case "Width"
                                Set NextToken = Scanner.GetToken
                                Upgrade = NextToken.Kind = tkFileHandle
                        End Select

                    Case OptionContext
                        Upgrade = Token.Text = "Base"
                        If Not Upgrade Then Upgrade = Token.Text = "Explicit"

                        If Not Upgrade Then
                            Upgrade = Token.Text = "Compare"
                            If Upgrade Then State = OptionCompareContext
                        End If

                    Case OptionCompareContext
                        Upgrade = Token.Text = "Binary"
                        If Not Upgrade Then Upgrade = Token.Text = "Text"

                    Case DeclareContext
                        Upgrade = Token.Text = "PtrSafe"

                        If Upgrade Then
                            State = DeclareLibContext

                        Else
                            Upgrade = Token.Text = "Lib"
                            If Upgrade Then State = DeclareAliasContext
                        End If

                    Case DeclareLibContext
                        Upgrade = Token.Text = "Lib"
                        If Upgrade Then State = DeclareAliasContext

                    Case DeclareAliasContext
                        Upgrade = Token.Text = "Alias"
                        Revoke = True

                    Case ForToContext
                        Upgrade = Token.Text = "Step"
                        Revoke = True

                    Case OnContext
                        Upgrade = Token.Text = "Error"
                        Revoke = True

                    Case [Next Keyword Is Input | Next Identifier Is Append, Binary, Output, or Random]
                        Upgrade = Token.Text = "Append"
                        If Not Upgrade Then Upgrade = Token.Text = "Binary"
                        If Not Upgrade Then Upgrade = Token.Text = "Output"
                        If Not Upgrade Then Upgrade = Token.Text = "Random"
                        State = [Next Keyword Is As or Shared | Next Identifier Is Access]

                    Case [Next Keyword Is As or Shared | Next Identifier Is Access]
                        Upgrade = Token.Text = "Access"

                        If Upgrade Then
                            State = [Next Keyword Is Access/Write | Next Identifier Is Access/Read]
                        Else
                            Upgrade = Token.Text = "Shared"
                            If Upgrade Then State = [Next Keyword Is As]
                        End If

                    Case [Next Keyword Is Access/Write, Lock, As, or Shared], _
                         [Next Keyword Is Lock, As, or Shared]
                        Upgrade = Token.Text = "Shared"
                        If Upgrade Then State = [Next Keyword Is As]

                    Case [Next Keyword Is Access/Write | Next Identifier Is Access/Read]
                        Upgrade = Token.Text = "Read"
                        If Upgrade Then State = [Next Keyword Is Access/Write, Lock, As, or Shared]

                    Case [Next Keyword Is Lock/Write | Next Identifier Is Lock/Read]
                        Upgrade = Token.Text = "Read"
                        If Upgrade Then State = [Next Keyword Is Lock/Write or As]

                    Case [Next Identifier Is Len]
                        Upgrade = Token.Text = "Len"
                        Revoke = True
                End Select

            Case tkFileHandle
                If State = [Next Token Is Filehandle] Then State = [Next Identifier Is Len]

            Case tkSoftLineBreak, tkHardLineBreak
                WasAs = False
        End Select

        If Upgrade Then
            Token.Kind = tkKeyword
            If Revoke Then State = NoContext
        End If
    End If

    Set LastToken = Token
    Set TokenFrom = Token
End Function


Private Function IsEndOfContext(ByVal Token As Token) As Boolean
    Dim Result As Boolean

    Result = Token.Kind = tkSoftLineBreak
    If Not Result Then Result = Token.Kind = tkHardLineBreak
    If Not Result Then Result = Token.Kind = tkRightParenthesis
    If Not Result Then Result = Token.Kind = tkListSeparator
    If Not Result Then Result = Token.Kind = tkPrintSeparator

    If Not Result And Token.Kind = tkKeyword Then
        Result = Token.Text = "Then"
        If Not Result Then Result = Token.Text = "Else"
    End If

    IsEndOfContext = Result
End Function

Next week we'll polish some things and fix some omissions.

Andrej Biasic
2020-10-07