Metamorphing Machine I rather be this walking metamorphosis
than having that old formed opinion about everything!

Let's build a transpiler! Part 29

This is the twenty-ninth post in a series about building a transpiler.
You can find the previous ones here.

Last time I said we'd parse the If statement.
Its two most common syntaxes are:

If condition Then statement [Else statement]

If condition Then
  [statement block]
[ElseIf condition Then
  [statement block]]
[ElseIf ...] 
[Else
  [statement block]]
End If

But we're talking about Visual Basic, which has the colon statement combinator.
For starters, this is completely legal (although useless):

If SomeCondition Then:

There's no need to have something after the colon.
Or you can have several statements after it, each one separated from the next by a colon:

If SomeCondition Then: Statement1: Statement2: Statement3

It is equivalent to:

If SomeCondition Then
  Statement1
  Statement2
  Statement3
End If
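A quick way to check that claim is to run both forms side by side. This is only a sketch: ColonFormDemo, InlineCount, and BlockCount are made-up names, and the snippet is meant to be pasted into a standard module and run from the Immediate window.

```vb
Option Explicit

Rem Hypothetical demo, not part of the transpiler.
Public Sub ColonFormDemo()
    Dim SomeCondition As Boolean
    Dim InlineCount As Long
    Dim BlockCount As Long

    SomeCondition = False

    Rem Colon form: three statements after "Then:"
    If SomeCondition Then: InlineCount = InlineCount + 1: InlineCount = InlineCount + 1: InlineCount = InlineCount + 1

    Rem Block form
    If SomeCondition Then
        BlockCount = BlockCount + 1
        BlockCount = BlockCount + 1
        BlockCount = BlockCount + 1
    End If

    Rem If both forms are equivalent, the counters match and this prints True
    Debug.Print InlineCount = BlockCount
End Sub
```

Flipping SomeCondition to True should leave the comparison unchanged, as long as every statement after the colon really belongs to the If.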

And then you can have this:

If SomeCondition Then
  (...)
ElseIf SomeOtherCondition Then Statement1
  Statement2
  (...)
End If

What is Statement1 doing there? That spot was supposed to hold a line break!
And you can go on and on using colons:

If SomeCondition Then
  (...)
ElseIf SomeOtherCondition Then Statement1: Statement2: Statement3
  Statement4
  (...)
End If
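The same shape with real statements, as a hedged sketch. ElseIfTailDemo, Value, and Trace are invented for the illustration; the snippet only assumes a standard module.

```vb
Option Explicit

Rem Hypothetical demo, not part of the transpiler.
Public Sub ElseIfTailDemo()
    Dim Value As Long
    Dim Trace As String

    Value = 2

    If Value = 1 Then
        Trace = "one"
    ElseIf Value = 2 Then Trace = "two": Trace = Trace & "!"
        Trace = Trace & "?"
    End If

    Rem Shows which statements ran for the ElseIf arm
    Debug.Print Trace
End Sub
```

If the statements on the ElseIf line and the ones on the following lines all belong to the same arm, Trace should collect all three pieces.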

Oh, well...
This is the full graph to parse an If statement:

[Graph: IfGraph]

Node labels: condition1, condition2 = condition; Then1, Then2 = Then; hard1, hard2, hard3 = line break; soft1, soft2, soft3 = ":"; statement1, statement3 = statement; statement2, block1, block2 = statement block; Else1, Else2 = Else.

Block form:
  start -> If
  If -> condition1
  condition1 -> Then1
  Then1 -> hard1, soft1, statement1
  hard1 -> block1
  block1 -> ElseIf, Else1, End If
  ElseIf -> condition2
  condition2 -> Then2
  Then2 -> hard1, statement2
  statement2 -> hard1, soft2
  soft2 -> hard1, statement2
  Else1 -> hard2
  hard2 -> block2
  block2 -> End If
  End If -> end

Single-line form:
  soft1 -> statement1, hard3
  statement1 -> soft1, Else2, hard3
  Else2 -> soft3, statement3
  soft3 -> statement3, hard3
  statement3 -> soft3, hard3
  hard3 -> end

Not your run-of-the-mill If, for sure...
Let's start by creating an IfArm class to represent an If condition and its associated block of statements:

Public Class IfArm
  Option Explicit

  Private Body_ As KeyedList

  Public Condition As IExpression

  Private Sub Class_Initialize()
    Set Body_ = New KeyedList
    Set Body_.T = New StmtValidator
  End Sub

  Public Property Get Body() As KeyedList
    Set Body = Body_
  End Property
End Class

Then we'll modify IfConstruct to have a collection of IfArms and an ElseBody:

Public Class IfConstruct
  Option Explicit
  Implements IStmt

  Private Arms_ As KeyedList
  Private ElseBody_ As KeyedList

  Private Sub Class_Initialize()
    Set Arms_ = New KeyedList
    Set Arms_.T = NewValidator(TypeName(New IfArm))

    Set ElseBody_ = New KeyedList
    Set ElseBody_.T = New StmtValidator
  End Sub

  Public Property Get Arms() As KeyedList
    Set Arms = Arms_
  End Property

  Public Property Get ElseBody() As KeyedList
    Set ElseBody = ElseBody_
  End Property

  Private Property Get IStmt_Kind() As StmtNumbers
    IStmt_Kind = snIf
  End Property
End Class

Arms will always have at least one item: the "opening" If.
Any ElseIfs will come after that first item.
And if ElseBody.Count is greater than zero, then we know the If statement had an Else clause.
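To make those three rules concrete, here is a small sketch that turns the two counts into a description of the statement's shape. It is not the series' code: DescribeIf is a made-up helper, and plain Collection objects stand in for KeyedList, since Count is the only member it needs.

```vb
Option Explicit

Rem Hypothetical helper: Collections stand in for the KeyedLists.
Public Function DescribeIf(ByVal Arms As Collection, ByVal ElseBody As Collection) As String
    Dim Result As String
    Dim Index As Long

    Rem The first arm is always the opening If
    Debug.Assert Arms.Count >= 1
    Result = "If"

    Rem Every arm after the first one is an ElseIf
    For Index = 2 To Arms.Count
        Result = Result & " ElseIf"
    Next

    Rem A non-empty ElseBody means there was an Else clause
    If ElseBody.Count > 0 Then Result = Result & " Else"
    DescribeIf = Result
End Function

Public Sub DescribeIfDemo()
    Dim Arms As New Collection
    Dim ElseBody As New Collection

    Arms.Add "x > 0"
    Arms.Add "x < 0"
    ElseBody.Add "Debug.Print ""zero"""

    Rem Prints: If ElseIf Else
    Debug.Print DescribeIf(Arms, ElseBody)
End Sub
```

One limitation of this representation shows up here: an Else clause with an empty body looks exactly like no Else clause at all.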

This is our third** attempt (after the first and the second*) at parsing all of it:

Private Function ParseIf(ByVal Entity As Entity, ByVal Body As KeyedList) As Token
  Dim Arm As IfArm
  Dim Token As Token
  Dim Whether As IfConstruct
  Dim Xp As Expressionist

  Set Xp = New Expressionist
  Xp.FullMode = True

  Set Whether = New IfConstruct

  Set Arm = New IfArm
  Rem If <condition> ?
  Set Arm.Condition = Xp.GetExpression(Me)
  Rem Get the last token first, so Fail has a token to point at
  Set Token = Xp.LastToken
  If Arm.Condition Is Nothing Then Fail Token, Msg065

  Rem If <condition> Then ?
  If Not Token.IsKeyword(kwThen) Then Fail Token, Msg088, NameBank(kwThen)

  Whether.Arms.Add Arm
  Set Token = NextToken

  If Token.Kind = tkSoftLineBreak Then
    Rem If <condition> Then :
    Do
      Set Token = NextToken
      If IsHardBreak(Token) Then Exit Do
Up:   If Not IsStatement(Token) Then Fail Token, Msg087

      Rem If <condition> Then : <statement>
      Rem kwIf is ParseBody's ClosingToken: it stops at "End If"
      Set Token = ParseBody(Entity, Arm.Body, kwIf, SingleLine:=True, LookAhead:=Token)
    Loop While Token.Kind = tkSoftLineBreak

    If Token.IsKeyword(kwElse) Then
      Rem If <condition> Then : <statement> Else
      Set Token = NextToken

      Do
        If Token.Kind = tkSoftLineBreak Then Set Token = NextToken
        If Not IsStatement(Token) Then Fail Token, Msg087

        Set Token = ParseBody(Entity, Whether.ElseBody, kwIf, SingleLine:=True, LookAhead:=Token)
      Loop While Token.Kind = tkSoftLineBreak
    End If

    If Not IsHardBreak(Token) Then Fail Token, Msg031

  ElseIf IsHardBreak(Token) Then
    Rem Block If: parse the first arm's body
    Set Token = ParseBody(Entity, Arm.Body, kwIf)
    If Token.Kind <> tkKeyword Then Fail Token, Msg089

    Do
      Select Case Token.Code
        Case kwElseIf
          Set Arm = New IfArm
          Set Arm.Condition = Xp.GetExpression(Me)
          If Arm.Condition Is Nothing Then Fail Token, Msg065

          Set Token = Xp.LastToken
          If Not Token.IsKeyword(kwThen) Then Fail Token, Msg088, NameBank(kwThen)

          Set Token = ParseBody(Entity, Arm.Body, kwIf)
          Whether.Arms.Add Arm

        Case kwElse
          Set Token = NextToken
          If Not IsHardBreak(Token) Then Fail Token, Msg027

          Set Token = ParseBody(Entity, Whether.ElseBody, kwIf)

          Rem After the Else block only "End If" is acceptable
          If Token.IsKeyword(kwIf) Then
            Set Token = NextToken
            Exit Do
          End If

          Fail Token, Msg085 & NameBank(kwIf)

        Case kwIf
          Rem ParseBody consumed "End" and returned the "If" of "End If"
          Set Token = NextToken
          Exit Do

        Case Else
          Fail Token, Msg089
      End Select
    Loop

  ElseIf IsStatement(Token) Then
    Rem If <condition> Then <statement>
    GoTo Up

  Else
    Fail Token, Msg090
  End If

  Body.Add Whether
  Set ParseIf = Token
End Function

For it to work we had to change ParseConsts. Now it returns the last token it read.

Private Function ParseConsts( _
  ByVal Access As Accessibility, _
  ByVal Entity As Entity, _
  ByVal Body As KeyedList, _
  Optional ByVal InsideProc As Boolean _
) As Token
  Dim Name As String
  Dim Token As Token
  Dim Cnt As ConstConstruct
  Dim Xp As New Expressionist

  Debug.Assert Not Entity Is Nothing

  Do
    Rem Get Const's name
    Set Token = SkipLineBreaks
    If Not IsProperId(Token) Then Fail Token, Msg023, Msg003

    Set Cnt = New ConstConstruct
    Cnt.Access = Access
    Set Cnt.Id = NewId(Token)

    Set Token = NextToken

    Rem Do we have an As clause?
    If Token.IsKeyword(kwAs) Then
      If Token.Suffix <> vbNullChar Then Fail Token, Msg024

      Rem Get Const's data type name
      Set Token = NextToken
      If Not IsConstDataType(Token) Then Fail Token, Msg023, Msg025

      Set Cnt.DataType = NewDataType(Token)
      Set Token = NextToken

      If Token.IsOperator(opMul) Then
        If Cnt.DataType.Id.Name <> vString Then Fail Token, Msg026

        Set Cnt.DataType.FixedLength = Xp.GetExpression(Me)
        Set Token = Xp.LastToken
        If Cnt.DataType.FixedLength Is Nothing Then Fail Token, Msg065
      End If

    ElseIf Cnt.Id.Name.Suffix <> vbNullChar Then
      Rem Assign DataType property based on type suffix
      Set Cnt.DataType = FromChar(Cnt.Id.Name.Suffix)
    End If

    Rem Discard "="
    If Not Token.IsOperator(opEq) Then Fail Token, Msg023, "="

    Rem Get Const's value
    Set Cnt.Value = Xp.GetExpression(Me)
    If Cnt.Value Is Nothing Then Fail Token, Msg065

    Rem Ensure it's not a duplicated Const
    If Not InsideProc Then CheckDupl Entity, Cnt.Id.Name
    Name = NameOf(Cnt.Id.Name)
    If Body.Exists(Name) Then Fail Cnt.Id.Name, Msg006 & Name

    If Cnt.DataType Is Nothing Then
      Rem TODO: Infer its data type
    End If

    Rem Save it
    Body.AddKeyValue NameOf(Cnt.Id.Name), Cnt

    Rem Move on
    Set Token = Xp.LastToken

    If IsBreak(Token) Then Exit Do
    If InsideProc And Token.IsKeyword(kwElse) Then Exit Do

    If Token.Kind <> tkListSeparator Then Fail Token, Msg023, Msg027
  Loop

  Set ParseConsts = Token
End Function

This is so ParseBody can return it, too.

Private Function ParseBody( _
  ByVal Entity As Entity, _
  ByVal Body As KeyedList, _
  ByVal ClosingToken As Long, _
  Optional ByVal SingleLine As Boolean, _
  Optional ByVal LookAhead As Token _
) As Token
  Dim Token As Token
  Dim LinNum As LineNumberConstruct
  Dim Label As LabelConstruct

  Do
    If LookAhead Is Nothing Then
      Set Token = SkipLineBreaks
    Else
      Set Token = LookAhead
      Set LookAhead = Nothing
      If IsBreak(Token) Then Set Token = SkipLineBreaks
    End If

    Rem Do we have a line number?
    If Token.Kind = tkIntegerNumber Then
      Set LinNum = New LineNumberConstruct
      Set LinNum.Value = Token
      Body.Add LinNum
      Set Token = NextToken
    End If

    Rem Do we have a label?
    If Token.Kind = tkIdentifier Then
      Set LookAhead = NextToken

      If LookAhead.Kind = tkSoftLineBreak Then
        Set Label = New LabelConstruct
        Set Label.Id = NewId(Token)
        Body.Add Label
        Set LookAhead = Nothing
        Set Token = NextToken
      End If
    End If

    Select Case Token.Kind
      Case tkKeyword
        Select Case Token.Code
          Case kwEnd
            Rem Is it the End statement of End Sub, Function, or Property?
            Set LookAhead = NextToken

            If LookAhead.IsKeyword(ClosingToken) Then
              Set Token = LookAhead
              Exit Do
            End If

            If LookAhead.Kind = tkIdentifier And LookAhead.Code = cxProperty Then
              Set Token = LookAhead
              Exit Do
            End If

            Body.Add New EndConstruct

          Case kwDim
            ParseDim acLocal, Entity, Body, InsideProc:=True

          Case kwStatic
            ParseDim acLocal, Entity, Body, InsideProc:=True, IsStatic:=True

          Case kwConst
            Set Token = ParseConsts(acLocal, Entity, Body, InsideProc:=True)

          Case kwIf
            Rem Keep the last token ParseIf read, as done for ParseConsts
            Set Token = ParseIf(Entity, Body)

          Case kwElseIf, kwElse
            Exit Do

          Case Else
            Stop
        End Select

      Case tkIdentifier
        'TODO Complete

      Case tkEndOfStream
        Exit Do

      Case Else
        Stop
    End Select

    If SingleLine Then Exit Do
  Loop

  Set ParseBody = Token
End Function
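Why bother returning the last token? A sub-parser only learns that its job is done by reading one token too many, and the caller needs that token to decide what comes next. The toy below shows the idea outside the transpiler. Everything in it (Words_, Position_, NextWord, ParseNumbers) is invented for the illustration, and a space-separated string stands in for the token stream.

```vb
Option Explicit

Private Words_ As Variant
Private Position_ As Long

Rem Returns the next word, or an empty string at the end of the stream.
Private Function NextWord() As String
    If Position_ <= UBound(Words_) Then
        NextWord = Words_(Position_)
        Position_ = Position_ + 1
    End If
End Function

Rem Adds up numbers until it meets a word it does not own, then returns that word.
Private Function ParseNumbers(ByRef Total As Long) As String
    Dim Word As String

    Do
        Word = NextWord
        If Not IsNumeric(Word) Then Exit Do
        Total = Total + CLng(Word)
    Loop

    ParseNumbers = Word
End Function

Public Sub LastTokenDemo()
    Dim Total As Long
    Dim Word As String

    Words_ = Split("1 2 3 Else 4")
    Position_ = 0
    Word = ParseNumbers(Total)

    Rem Total is 6 and Word is "Else": the caller can now handle the Else
    Debug.Print Total, Word
End Sub
```

Had ParseNumbers been a Sub, the "Else" would have been consumed and lost, which is the situation ParseConsts and ParseBody were in before this change.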


Rem Add it to Messages

Public Property Get Msg085() As String
  Msg085 = "Expected: End "
End Property

Public Property Get Msg087() As String
  Msg087 = "Expected: statement"
End Property

Public Property Get Msg088() As String
  Msg088 = "Rule: If condition Then"
End Property

Public Property Get Msg089() As String
  Msg089 = "Expected: Else or ElseIf or End If"
End Property

Public Property Get Msg090() As String
  Msg090 = "Block If without End If"
End Property

And we created an IsHardBreak function. IsBreak makes no distinction between soft breaks and hard breaks, and now we need to tell them apart:

Friend Function IsHardBreak(ByVal Token As Token) As Boolean
  Debug.Assert Not Token Is Nothing

  IsHardBreak = Token.Kind = tkHardLineBreak Or Token.Kind = tkComment
End Function
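The difference between the two predicates can be seen in isolation with a stand-in enum. This is a sketch under assumptions: DemoKinds, DemoIsBreak, and DemoIsHardBreak are hypothetical names, and the body of DemoIsBreak is only a guess at what IsBreak accepts (its real definition is not shown in this post).

```vb
Option Explicit

Rem Hypothetical stand-ins for the tokenizer's token kinds.
Public Enum DemoKinds
    dkSoftLineBreak
    dkHardLineBreak
    dkComment
    dkKeyword
End Enum

Rem Assumed behavior: any kind of break counts.
Public Function DemoIsBreak(ByVal Kind As DemoKinds) As Boolean
    DemoIsBreak = Kind = dkSoftLineBreak Or DemoIsHardBreak(Kind)
End Function

Rem Mirrors IsHardBreak: a comment ends the line just like a line break does.
Public Function DemoIsHardBreak(ByVal Kind As DemoKinds) As Boolean
    DemoIsHardBreak = Kind = dkHardLineBreak Or Kind = dkComment
End Function

Public Sub BreakDemo()
    Rem A colon is a break, but not a hard one: prints True, then False
    Debug.Print DemoIsBreak(dkSoftLineBreak), DemoIsHardBreak(dkSoftLineBreak)

    Rem A comment is both: prints True twice
    Debug.Print DemoIsBreak(dkComment), DemoIsHardBreak(dkComment)
End Sub
```

That is the distinction ParseIf relies on: after Then, a colon keeps us on the same logical line, while a line break or a comment ends it.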

This was much harder than I thought it would be...
Next week we'll parse Select Cases.

* My first attempt did not survive a bunch of tests.
This is the second one and, yes, it has a GoTo there. Do I like it? No. Will I change it? Not for now.
Sorry if it sounded a little grumpy. This ParseIf took me too much time to come up with and I'm not 100% certain it fits the bill.
Only time will tell.

** Bug fix.

Andrej Biasic
2021-03-17