Metamorphing Machine I rather be this walking metamorphosis
than having that old formed opinion about everything!

Let's build a transpiler! Part 25

This is the twenty-fifth post in a series of building a transpiler.
You can find the previous ones here.

Last time I said we would parse array's subscripts.
Let's first create a class to hold both lower and upper subscripts and change Variable class to have a list of them:

Public Class SubscriptPair
Option Explicit

Private UpperBound_ As Token

Public LowerBound As Token

Public Property Get UpperBound() As Token
Set UpperBound = UpperBound_
End Property

Public Property Set UpperBound(ByVal Value As Token)
If Not UpperBound_ Is Nothing Then Set LowerBound = UpperBound_
Set UpperBound_ = Value
End Property
End Class


Public Class Variable
(...)

Public Static Property Get Subscripts() As KeyedList
Dim Hidden As New KeyedList
Set Subscripts = Hidden
End Property
End Class

Now, we need to deal with expressions. I've been postponing it because it has been proved to be a hard topic for me.
I've had several false starts and dead-ends. I don't know if I'm too dumb or just not up to the task of taming VB's expressions.

Anyway, expressions are things that are values or produce values. We'll create a bunch of classes to deal with them, but first, let's create an interface so we can link them together later.

Public Class IExpression
Option Explicit

Public Enum ExpressionKind
ekLiteral
ekSymbol
ekFileHandle
ekTuple
ekUnaryExpr
ekBinaryExpr
End Enum

Private Sub Class_Initialize()
Err.Raise 5
End Sub

Public Property Get Kind() As ExpressionKind
End Property
End Class

VB6 does not have proper interfaces. We create a class, then other classes declare they implement that class' "interface."
In order to not let it be instantiated by mistake, we're raising an error in its constructor.

Now back to the different ways we can have expressions. Right out of the bat, there are some tokens that are values/expressions:
Numbers, strings, and date literals are expressions, as are a few keywords like True, False, Nothing, Empty, Null, Me, and the pseudo-keyword Void that we'll use for missing parameters.
To distinguish these tokens from the others, we'll create a Literal class:

Public Class Literal
Option Explicit
Implements IExpression

Public Value As Token

Private Property Get IExpression_Kind() As ExpressionKind
IExpression_Kind = ekLiteral
End Property
End Class

Identifiers and escaped identifiers are expressions, too. We'll group them into a Symbol class:

Public Class Symbol
Option Explicit
Implements IExpression

Public Value As Token

Private Property Get IExpression_Kind() As ExpressionKind
IExpression_Kind = ekSymbol
End Property
End Class

Filehandles will be represented by the FileHandle class:

Public Class FileHandle
Option Explicit
Implements IExpression

Public Value As Token

Private Property Get IExpression_Kind() As ExpressionKind
IExpression_Kind = ekFileHandle
End Property
End Class

We can have expressions composed by unary operators - like Not something...

Public Class UnaryExpression
Option Explicit
Implements IExpression

Public Operator As Operator
Public Value As IExpression

Private Property Get IExpression_Kind() As ExpressionKind
IExpression_Kind = ekUnaryExpr
End Property
End Class

...and expressions composed by binary operators, like lhs And rhs:

Public Class BinaryExpression
Option Explicit
Implements IExpression

Public LHS As IExpression
Public Operator As Operator
Public RHS As IExpression

Private Property Get IExpression_Kind() As ExpressionKind
IExpression_Kind = ekBinaryExpr
End Property
End Class

For that to work, we need an Operator class:

Public Class Operator
Option Explicit

Public Value As Token

Public Property Get IsUnary() As Boolean
Select Case Value.Code
Case opAddressOf, opNew, opNot, opTypeOf, opId, opNeg, opWithDot, opWithBang
IsUnary = True
End Select
End Property

Public Property Get IsBinary() As Boolean
IsBinary = Not IsUnary
End Property
End Class

As I'm expecting our GetExpression function to grow a bit, I'm moving it to a new Expressionist class.
Because of that, I had to change the Fail, IsBreak, and IsOp methods from Private to Friend, and also all GetExpression calls to provide a Me argument.
An important change is that any call to NextToken immediately after a call to GetExpression has to be changed to recovering the token from Expresionnist's LastToken. The pattern is:

Dim Xp As Expressionist
Set Xp = New Expressionist
(...)
Set whatever = Xp.GetExpression(Me)
Set token = NextToken
Set token = Xp.LastToken

Here it is Expressionist:

Private Class Expressionist
Option Explicit
Private LastToken_ As Token

Public Property Get LastToken() As Token
Set LastToken = LastToken_
End Property

Private Function Peek(ByVal Stack As KeyedList) As Variant
Set Peek = Stack(Stack.Count)
End Function

Private Function Pop(ByVal Stack As KeyedList) As Variant
Dim Index As Long

Index = Stack.Count
Set Pop = Stack(Index)
Stack.Remove Index
End Function

Public Function GetExpression(ByVal Parser As Parser) As IExpression
Rem TODO: Complete it.
End Function
End Class

Note that GetExpression now returns an IExpression instead of a Token.
Due to that, we'll need to update ParseConsts...

Set Token = NextToken
Set Cnt.Value = Token

Set Cnt.Value = Xp.GetExpression(Me)

... and DataType...

Public FixedLength As Token
Public FixedLength As IExpression

... and Parameter and Variable...

Public Init As Token
Public Init As IExpression

... and SubscriptPair...

Public Class SubscriptPair
Option Explicit

Private UpperBound_ As IExpression

Public LowerBound As IExpression

Public Property Get UpperBound() As IExpression
Set UpperBound = UpperBound_
End Property

Public Property Set UpperBound(ByVal Value As IExpression)
If Not UpperBound_ Is Nothing Then Set LowerBound = UpperBound_
Set UpperBound_ = Value
End Property
End Class

... and ConstConstruct and EnumerandConstruct...

Public Value As Token
Public Value As IExpression

... and ParseEnum:

Set Emd.Value = New Token
Emd.Value.Kind = tkIntegerNumber
Emd.Value.Text = CStr(Count)
Emd.Value.Suffix = "&"

Dim Lit As Literal
(...)
Set Lit = New Literal
Set Lit.Value = New Token
Lit.Value.Kind = tkIntegerNumber
Lit.Value.Text = CStr(Count)
Lit.Value.Suffix = "&"
Set Emd.Value = Lit

With that out of our way, let's deal with the code.
We will base ours on this Stack Overflow answer.
What follows is pretty much a translation from the answer to VB with these exceptions: Public Function GetExpression(ByVal Parser As Parser) As IExpression
Const EXPR_ERR = "Invalid expression"
Dim WantOperand As Boolean
Dim Cp As Integer
Dim Token As Token
Dim Sym As Symbol
Dim Lit As Literal
Dim Op As Operator
Dim Op2 As Operator
Dim OpStack As KeyedList
Dim OutStack As KeyedList
Dim Handle As FileHandle

Set OpStack = New KeyedList
Set OutStack = New KeyedList
WantOperand = True

Do
Set Token = Parser.NextToken

If WantOperand Then
WantOperand = False

Select Case Token.Kind
Case tkOperator
WantOperand = True

Select Case Token.Code
Case opSum
Token.Code = opId

Case opSubt
Token.Code = opNeg

Case opAddressOf, opNew, opNot, opTypeOf, opWithBang, opWithDot
Rem OK

Case Else
Parser.Fail Token, EXPR_ERR
End Select

Set Op = NewOperator(Token, Prefix)
OpStack.Add Op

Case tkLeftParenthesis
Rem Pseudo-operator
Set Op = NewOperator(Token)
OpStack.Add Op
WantOperand = True

Case tkIdentifier, tkEscapedIdentifier
Set Sym = New Symbol
Set Sym.Value = Token
OutStack.Add Sym

Case tkIntegerNumber, tkFloatNumber, tkSciNumber, tkBinaryNumber, _
tkOctalNumber, tkHexaNumber, tkString, tkDateTime
Set Lit = New Literal
Set Lit.Value = Token
OutStack.Add Lit

Case tkFileHandle
Set Handle = New FileHandle
Set Handle.Value = Token
OutStack.Add Handle

Case tkKeyword
Select Case Token.Code
Case kwTrue, kwFalse, kwNothing, kwEmpty, kwNull, kwMe
Set Lit = New Literal
Set Lit.Value = Token
OutStack.Add Lit

Case Else
Parser.Fail Token, EXPR_ERR
End Select

Case Else
Parser.Fail Token, EXPR_ERR
End Select
Else
If Parser.IsBreak(Token) Then
While OpStack.Count > 0
Move OpStack, OutStack
Wend

Exit Do
End If

Select Case Token.Kind
Case tkOperator
Select Case Token.Code
Case opAddressOf, opNew, opNot, opTypeOf, opCompAnd, opCompEqv, _
opCompImp, opCompMod, opCompOr, opCompXor, opCompSum, opCompSubt, opCompMul, _
opCompDiv, opCompIntDiv, opCompPow, opCompLSh, opCompRSh, opCompURSh, opCompConcat
Parser.Fail Token, EXPR_ERR
End Select

Set Op2 = NewOperator(Token)

Do While OpStack.Count > 0
Set Op = Peek(OpStack)
If Op.Value.Kind = tkLeftParenthesis Then Exit Do

Cp = ComparePrecedence(Op, Op2)
If Cp = -1 Then Exit Do
Move OpStack, OutStack, Op
Loop

OpStack.Add Op2
WantOperand = True

Case tkLeftParenthesis
Rem Apply operator
Set Token = New Token
Token.Kind = tkOperator
Token.Code = opApply
Set Op = NewOperator(Token)
OpStack.Add Op

Case tkRightParenthesis
Do While OpStack.Count > 0
Set Op = Peek(OpStack)
If Op.Value.Kind = tkLeftParenthesis Then Exit Do
If IsOp(Op.Value, opApply) Then Exit Do

Move OpStack, OutStack, Op
Loop

If OutStack.Count = 0 Then Parser.Fail Token, EXPR_ERR
Pop OpStack

Case tkListSeparator
Do While OpStack.Count > 0
Set Op = Pop(OpStack)
If Op.Value.Kind = tkLeftParenthesis Then Exit Do

Move OpStack, OutStack, Op
Loop

If OutStack.Count = 0 Then Parser.Fail Token, EXPR_ERR
WantOperand = True

Case Else
Parser.Fail Token, EXPR_ERR
End Select
End If
Loop

Do While OutStack.Count > 1
Set Op = Pop(OutStack)
If Op.Value.Kind = tkLeftParenthesis Then Exit Do

Move OpStack, OutStack, Op
Loop

Set LastToken_ = Token
Set GetExpression = Pop(OutStack)
End Function

Private Sub Move(ByVal OpStack As KeyedList, ByVal OutStack As KeyedList, Optional ByVal Op As Operator)
Dim IExpr As IExpression
Dim Uni As UnaryExpression
Dim Bin As BinaryExpression

If Op Is Nothing Then Set Op = Peek(OpStack)

If Op.IsUnary Then
Set Uni = New UnaryExpression
Set Uni.Operator = Op
Set Uni.Value = Pop(OutStack)
Set IExpr = Uni
Else
Set Bin = New BinaryExpression
Set Bin.Operator = Op
Set Bin.RHS = Pop(OutStack)
Set Bin.LHS = Pop(OutStack)
Set IExpr = Bin
End If

OutStack.Add IExpr
Pop OpStack
End Sub


Rem This goes into Globals module.
Public Function ComparePrecedence(ByVal LeftOp As Operator, ByVal RightOp As Operator) As Integer
Dim LHS As Integer
Dim RHS As Integer

LHS = Precedence(LeftOp)
RHS = Precedence(RightOp)

If LHS = RHS Then Exit Function

If LHS < RHS Then
ComparePrecedence = -1
Else
ComparePrecedence = 1
End If
End Function

Private Function Precedence(ByVal Op As Operator) As Integer
Select Case Op.Value.Code
Case opApply
Precedence = 19

Case opPow
Precedence = 18

Case opAddressOf, opNew
Precedence = 17

Case opId, opNeg, opDot, opBang, opWithDot, opWithBang, opTypeOf
Precedence = 16

Case opLSh, opRSh, opURSh
Precedence = 15

Case opMul, opDiv
Precedence = 14

Case opIntDiv
Precedence = 13

Case opMod
Precedence = 12

Case opSum, opSubt
Precedence = 11

Case opConcat
Precedence = 10

Case opGt, opGe, opEq, opLe, opLt, opNe, opIsNot, opIs, opLike, opTo
Precedence = 9

Case opNot
Precedence = 8

Case opAnd, opAndAlso
Precedence = 7

Case opOr, opOrElse
Precedence = 6

Case opXor
Precedence = 5

Case opEqv
Precedence = 4

Case opImp
Precedence = 3

Case opNamed
Precedence = 2

Case opCompSum, opCompSubt, opCompMul, opCompDiv, opCompIntDiv, opCompPow, opCompConcat, opCompLSh, _
opCompRSh, opCompURSh, opCompAnd, opCompEqv, opCompImp, opCompMod, opCompOr, opCompXor
Precedence = 1

Case Else
Debug.Assert False
End Select
End Function


Rem This goes into Messages.
Public Property Get Msg065() As String
Msg065 = "Invalid expression"
End Property

Now we will enhance GetExpression to deal with subscripts. Here it is the updated code:

Rem Add it to classe's declaration area.
Public CanHaveTo As Boolean


Public Function GetExpression(ByVal Parser As Parser) As IExpression
Dim HadTo As Boolean
Dim WantOperand As Boolean
Dim Cp As Integer
Dim Token As Token
Dim Sym As Symbol
Dim Lit As Literal
Dim Op As Operator
Dim Op2 As Operator
Dim OpStack As KeyedList
Dim OutStack As KeyedList
Dim Handle As FileHandle

Set OpStack = New KeyedList
Set OutStack = New KeyedList
WantOperand = True

Do
Set Token = Parser.NextToken

If WantOperand Then
WantOperand = False

Select Case Token.Kind
Case tkOperator
WantOperand = True

Select Case Token.Code
Case opSum
Token.Code = opId

Case opSubt
Token.Code = opNeg

Rem Remaining unary operators.
Case opAddressOf, opNew, opNot, opTypeOf, opWithBang, opWithDot
Rem OK

Case Else
Exit Do
End Select

Set Op = NewOperator(Token)
OpStack.Add Op

Case tkLeftParenthesis
Rem Pseudo-operator.
Set Op = NewOperator(Token)
OpStack.Add Op
WantOperand = True

Case tkIdentifier, tkEscapedIdentifier
Set Sym = New Symbol
Set Sym.Value = Token
OutStack.Add Sym

Case tkIntegerNumber, tkFloatNumber, tkSciNumber, tkBinaryNumber, _
tkOctalNumber, tkHexaNumber, tkString, tkDateTime
Set Lit = New Literal
Set Lit.Value = Token
OutStack.Add Lit

Case tkFileHandle
Set Handle = New FileHandle
Set Handle.Value = Token
OutStack.Add Handle

Case tkKeyword
Select Case Token.Code
Case kwTrue, kwFalse, kwNothing, kwEmpty, kwNull, kwMe
Set Lit = New Literal
Set Lit.Value = Token
OutStack.Add Lit

Case Else
Exit Do
End Select

Case Else
Exit Do
End Select
Else
If Parser.IsBreak(Token) Then
While OpStack.Count > 0
Move OpStack, OutStack
Wend

Exit Do
End If

Select Case Token.Kind
Case tkOperator
Rem Unary and compound operators.
Select Case Token.Code
Case opAddressOf, opNew, opNot, opTypeOf, opCompAnd, opCompEqv, opCompImp, _
opCompMod, opCompOr, opCompXor, opCompSum, opCompSubt, opCompMul, opCompDiv, _
opCompIntDiv, opCompPow, opCompLSh, opCompRSh, opCompURSh, opCompConcat
Parser.Fail Token, Msg065
End Select

Set Op2 = NewOperator(Token)

Do While OpStack.Count > 0
Set Op = Peek(OpStack)
If Op.Value.Kind = tkLeftParenthesis Then Exit Do

Cp = ComparePrecedence(Op, Op2)
If Cp = -1 Then Exit Do
Move OpStack, OutStack, Op
Loop

OpStack.Add Op2
WantOperand = True

Case tkLeftParenthesis
Set Token = New Token
Token.Kind = tkOperator
Token.Code = opApply
Set Op = NewOperator(Token)
OpStack.Add Op

Case tkRightParenthesis
Do While OpStack.Count > 0
Set Op = Peek(OpStack)
If Op.Value.Kind = tkLeftParenthesis Then Exit Do
If Parser.IsOp(Op.Value, opApply) Then Exit Do

Move OpStack, OutStack, Op
Loop

If OpStack.Count = 0 Then Exit Do

If Op.Notation = Infix Then
Op.Value.Code = opApply
Move OpStack, OutStack, Op
Else
Pop OpStack
End If

Case tkKeyword
Select Case Token.Code
Case kwTo
If Not CanHaveTo Or HadTo Then Err.Raise vbObjectError + 13
HadTo = True

Token.Kind = tkOperator
Token.Code = Parser.Scanner.Operators.IndexOf(vTo)
OpStack.Add NewOperator(Token)
WantOperand = True

Case Else
Debug.Assert False
End Select

Case tkListSeparator
Do While OpStack.Count > 0
Set Op = Pop(OpStack)
If Op.Value.Kind = tkLeftParenthesis Then Exit Do

Move OpStack, OutStack, Op
Loop

If OutStack.Count = 0 Then Parser.Fail Token, EXPR_ERR
WantOperand = True

Case Else
Exit Do
End Select
End If
Loop

Set LastToken_ = Token

Do While OutStack.Count > 1
Set Op = Peek(OpStack)
If Op.Value.Kind = tkLeftParenthesis Then Exit Do

Move OpStack, OutStack, Op
Loop

If OutStack.Count = 1 Then Set GetExpression = Pop(OutStack)
End Function

You may have noticed we introduced operators To and "apply":

Public Enum OperatorConstants
(...)
opOrElse
opTo
opTypeOf
(...)
opApply
End Enum

Private Sub Class_Initialize()
(...)
Values = Array(vAddressOf, vAndAlso, vIs, vIsNot, vLike, vNew, vNot, vOrElse, vTo, vTypeOf, _
(...), "&=", "")
(...)
End Sub

Now we can change ParseDim to take variable's subscripts into consideration:

Private Sub ParseDim( _
ByVal Access As Accessibility, _
ByVal Entity As Entity, _
Optional ByVal InsideProc As Boolean, _
Optional ByVal Token As Token _
)
Dim WasArray As Boolean
Dim Var As Variable
Dim Xp As Expressionist
Dim Expr As IExpression
Dim Bin As BinaryExpression
Dim Uni As UnaryExpression
Dim Subs As SubscriptPair
Dim Lit As Literal
Dim Tkn As Token


If InsideProc Then: If Access = AccessPublic Or Access = AccessPrivate Then Fail Token, Msg063
If Token Is Nothing Then Set Token = NextToken

Set Xp = New Expressionist
Xp.CanHaveTo = True

Do
Set Var = New Variable
Var.Access = Access

If IsKw(Token, kwWithEvents) Then
If Not Entity.IsClass Then Fail Token, Msg016
If InsideProc Then Fail Token, Msg063

Var.HasWithEvents = True
Set Token = NextToken
End If

If Not IsProperId(Token, CanHaveSuffix:=True) Then Fail Token, Msg061, Msg003
Set Var.Id.Name = Token

Set Token = NextToken
WasArray = False

If Token.Kind = tkLeftParenthesis Then
Do
Set Expr = Xp.GetExpression(Me)
Set Token = Xp.LastToken

If Not Expr Is Nothing Then
Select Case Expr.Kind
Case ekLiteral, ekSymbol, ekUnaryExpr
Set Subs = New SubscriptPair
Set Subs.LowerBound = SynthLower(Entity)
Set Subs.UpperBound = Expr

Case ekBinaryExpr
Set Bin = Expr
Set Subs = New SubscriptPair

If IsOp(Bin.Operator.Value, opTo) Then
Set Subs.LowerBound = Bin.LHS
Set Subs.UpperBound = Bin.RHS
Else
Set Subs.LowerBound = SynthLower(Entity)
Set Subs.UpperBound = Expr
End If

Case Else
Fail Token, Msg065
End Select
End If

Var.Subscripts.Add Subs
If Token.Kind <> tkListSeparator Then Exit Do
Loop

If Token.Kind <> tkRightParenthesis And _
Xp.LastToken.Kind <> tkRightParenthesis Then _
Fail Token, Msg057

WasArray = True
Set Token = NextToken
End If

If IsKw(Token, kwAs) Then
If Var.Id.Name.Suffix <> vbNullChar Then Fail Token, Msg024
Set Token = NextToken

If IsOp(Token, kwNew) Then
Var.HasNew = True
Set Token = NextToken
End If

If Not IsProperDataType(Token) Then Fail Token, Msg061, Msg025
Set Var.DataType = NewDataType(Token)

If Var.HasNew And Var.DataType.Id.Name.Kind = tkKeyword Then _
Fail Token, Msg062, Msg059

Set Token = NextToken

If IsOp(Token, opDot) Then
Set Token = NextToken

If Not IsProperDataType(Token) Then Fail Token, Msg061, Msg003
Set Var.DataType.Id.Name = Token

Set Token = NextToken
End If

ElseIf Var.Id.Name.Suffix <> vbNullChar Then
Set Var.DataType = FromChar(Var.Id.Name.Suffix)

Else
Set Var.DataType = Entity.DefTypes(NameOf(Var.Id.Name))
End If

Var.DataType.IsArray = WasArray
If Var.HasNew And Var.DataType.IsArray Then Fail Token, Msg064

If IsOp(Token, opEq) Then
Set Var.Init = Token
Set Var.Init = Xp.GetExpression(Me)
Set Token = GetExpression
Set Token = Xp.LastToken
End If

Entity.Vars.Add Var, NameOf(Var.Id.Name)

If IsBreak(Token) Then Exit Do
If Token.Kind <> tkListSeparator Then Fail Token, Msg061, ","
Set Token = NextToken
Loop
End Sub

It depends on two utilities: SynthLower and NewOperator:

Rem This belongs to Parser
Private Function SynthLower(ByVal Entity As Entity) As IExpression
Dim Lit As Literal
Dim Tkn As Token

Set Tkn = New Token
Tkn.Kind = tkIntegerNumber
Tkn.Text = CStr(Entity.OptionBase)

Set Lit = New Literal
Set Lit.Value = Tkn

Set SynthLower = Lit
End Function


Rem This belongs to Globals module
Public Function NewOperator(ByVal Token As Token) As Operator
Dim Result As Operator

Set Result = New Operator
Set Result.Value = Token
Set NewOperator = Result
End Function

Finally we are able to parse something like "Private Letters_(0 To LAST_INDEX) As Token", but not "ReDim CodePoints_(0 To Size \ SizeOf(kwInteger) - 1) As Integer".
We are not able to deal with function calls yet.

Next week, we'll parse Types and wrap up with the declaration area.

Andrej Biasic
2021-02-10