Let's build a transpiler! Part 29
This is the twenty-ninth post in a series on building a transpiler. You can find the previous ones here.
Last time I said we'd parse the If statement.
Its two most common syntaxes are:
If condition Then statement [Else statement]↩
If condition Then↩
[statement block↩]
[ElseIf condition Then↩
[statement block]↩]
[ElseIf ...]
[Else↩
[statement block↩]]
End If↩
But we're talking about Visual Basic, which has the colon as a statement separator.
For starters, this is completely legal (although useless):
If SomeCondition Then:
There's no need to have something after the colon.
Or you can have several statements after it, each separated from the next by a colon:
If SomeCondition Then: Statement1: Statement2: Statement3
It is equivalent to:
If SomeCondition Then
Statement1
Statement2
Statement3
End If
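To make that equivalence concrete, here is a minimal VBA sketch that can be pasted into a standard module. The names ColonForm and BlockForm are hypothetical and exist only for this illustration; they are not part of the transpiler.

```vb
Option Explicit

Rem Hypothetical demo, not part of the transpiler.
Rem Single-line form: every colon-separated statement belongs to the Then part.
Public Function ColonForm(ByVal Flag As Boolean) As Long
    Dim N As Long
    If Flag Then: N = N + 1: N = N + 10: N = N + 100
    ColonForm = N
End Function

Rem Block form: the same statements, one per line, closed by End If.
Public Function BlockForm(ByVal Flag As Boolean) As Long
    Dim N As Long
    If Flag Then
        N = N + 1
        N = N + 10
        N = N + 100
    End If
    BlockForm = N
End Function
```

Both functions should return the same value for the same argument: all three additions run when Flag is True, and none of them runs when it is False.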
And then you can have this:
If SomeCondition Then
(...)
ElseIf SomeOtherCondition Then Statement1
Statement2
(...)
End If
What is Statement1 doing there? That spot was supposed to hold a line break!
And you can go on and on using colons:
If SomeCondition Then
(...)
ElseIf SomeOtherCondition Then Statement1: Statement2: Statement3
Statement4
(...)
End If
Oh, well...
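For anyone who wants to try that last shape, here is a small VBA sketch, assuming it is pasted into a standard module. Classify is a hypothetical name used only here. The point is that the statements sharing a line with ElseIf ... Then and the statement on the following line all end up in the same arm.

```vb
Option Explicit

Rem Hypothetical demo, not part of the transpiler.
Public Function Classify(ByVal N As Long) As String
    Dim Result As String

    If N < 0 Then
        Result = "negative"
    ElseIf N = 0 Then Result = "zero": Result = Result & "!"
        Rem Still inside the ElseIf arm, even though it sits on its own line.
        Result = Result & "?"
    Else
        Result = "positive"
    End If

    Classify = Result
End Function
```

Calling Classify with 0 should run all three statements of the ElseIf arm, while the other arms are unaffected.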
This is the full graph to parse an If statement:
Not your run-of-the-mill If, for sure...
Let's start by creating an IfArm class to represent an If condition and its associated block of statements:
Public Class IfArm
    Option Explicit

    Private Body_ As KeyedList
    Public Condition As IExpression

    Private Sub Class_Initialize()
        Set Body_ = New KeyedList
        Set Body_.T = New StmtValidator
    End Sub

    Public Property Get Body() As KeyedList
        Set Body = Body_
    End Property
End Class
Then we'll modify IfConstruct to have a collection of IfArms and an ElseBody:
Public Class IfConstruct
    Option Explicit
    Implements IStmt

    Private Arms_ As KeyedList
    Private ElseBody_ As KeyedList

    Private Sub Class_Initialize()
        Set Arms_ = New KeyedList
        Set Arms_.T = NewValidator(TypeName(New IfArm))

        Set ElseBody_ = New KeyedList
        Set ElseBody_.T = New StmtValidator
    End Sub

    Public Property Get Arms() As KeyedList
        Set Arms = Arms_
    End Property

    Public Property Get ElseBody() As KeyedList
        Set ElseBody = ElseBody_
    End Property

    Private Property Get IStmt_Kind() As StmtNumbers
        IStmt_Kind = snIf
    End Property
End Class
Arms will always have at least one item: the "opening" If.
Any ElseIfs will come after that first item.
And if ElseBody.Count is greater than zero, then we know the If statement had an Else clause.
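Those three rules are enough to recover the overall shape of an If from its parsed form. The sketch below is only an illustration: IfShape and DemoShape are hypothetical names, and plain Collections stand in for KeyedList because only Count is needed here.

```vb
Option Explicit

Rem Hypothetical helper, not part of the transpiler.
Rem Arms and ElseBody are plain Collections standing in for KeyedList.
Public Function IfShape(ByVal Arms As Collection, ByVal ElseBody As Collection) As String
    Dim Index As Long
    Dim Result As String

    Rem The first arm is always the opening If.
    Debug.Assert Arms.Count >= 1
    Result = "If"

    Rem Every further arm is an ElseIf.
    For Index = 2 To Arms.Count
        Result = Result & " ElseIf"
    Next

    Rem A non-empty ElseBody means there was an Else clause.
    If ElseBody.Count > 0 Then Result = Result & " Else"

    IfShape = Result
End Function

Rem Builds two arms plus an Else body and reports the shape.
Public Function DemoShape() As String
    Dim Arms As New Collection
    Dim ElseBody As New Collection

    Arms.Add "If A Then"
    Arms.Add "ElseIf B Then"
    ElseBody.Add "Statement1"

    DemoShape = IfShape(Arms, ElseBody)
End Function
```

With two arms and a non-empty Else body, DemoShape returns "If ElseIf Else".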
This is our ParseIf function:
Private Function ParseIf(ByVal Entity As Entity, ByVal Body As KeyedList) As Token
    Dim Arm As IfArm
    Dim Token As Token
    Dim Whether As IfConstruct
    Dim Xp As Expressionist

    Set Xp = New Expressionist
    Xp.FullMode = True
    Set Whether = New IfConstruct
    Set Arm = New IfArm

    Rem If <condition> ?
    Set Arm.Condition = Xp.GetExpression(Me)
    Rem Fetch the last token first, so Fail has a token to report
    Set Token = Xp.LastToken
    If Arm.Condition Is Nothing Then Fail Token, Msg065

    Rem If <condition> Then ?
    If Not Token.IsKeyword(kwThen) Then Fail Token, Msg088, NameBank(kwThen)
    Whether.Arms.Add Arm

    Set Token = NextToken

    If Token.Kind = tkSoftLineBreak Then
        Rem If <condition> Then :
        Do
            Set Token = NextToken
            If IsHardBreak(Token) Then Exit Do

Up:
            If Not IsStatement(Token) Then Fail Token, Msg087
            Rem If <condition> Then : <statement>
            Set Token = ParseBody(Entity, Arm.Body, kwIf, SingleLine:=True, LookAhead:=Token)
        Loop While Token.Kind = tkSoftLineBreak

        If Token.IsKeyword(kwElse) Then
            Rem If <condition> Then : <statement> Else
            Set Token = NextToken

            Do
                If Token.Kind = tkSoftLineBreak Then Set Token = NextToken
                If Not IsStatement(Token) Then Fail Token, Msg087
                Set Token = ParseBody(Entity, Whether.ElseBody, kwIf, SingleLine:=True, LookAhead:=Token)
            Loop While Token.Kind = tkSoftLineBreak
        End If

        If Not IsHardBreak(Token) Then Fail Token, Msg031

    ElseIf IsHardBreak(Token) Then
        Rem Block form. ClosingToken is kwIf so ParseBody stops at End If
        Set Token = ParseBody(Entity, Arm.Body, kwIf)
        If Token.Kind <> tkKeyword Then Fail Token, Msg089

        Do
            Select Case Token.Code
                Case kwElseIf
                    Set Arm = New IfArm
                    Set Arm.Condition = Xp.GetExpression(Me)
                    If Arm.Condition Is Nothing Then Fail Token, Msg065

                    Set Token = Xp.LastToken
                    If Not Token.IsKeyword(kwThen) Then Fail Token, Msg088, NameBank(kwThen)

                    Set Token = ParseBody(Entity, Arm.Body, kwIf)
                    Whether.Arms.Add Arm

                Case kwElse
                    Set Token = NextToken
                    If Not IsHardBreak(Token) Then Fail Token, Msg027

                    Set Token = ParseBody(Entity, Whether.ElseBody, kwIf)

                    If Token.IsKeyword(kwIf) Then
                        Set Token = NextToken
                        Exit Do
                    End If

                    Fail Token, Msg085 & NameBank(kwIf)

                Case kwIf
                    Rem ParseBody returns the If of End If
                    Set Token = NextToken
                    Exit Do

                Case Else
                    Fail Token, Msg089
            End Select
        Loop

    ElseIf IsStatement(Token) Then
        Rem If <condition> Then <statement>
        GoTo Up

    Else
        Fail Token, Msg090
    End If

    Body.Add Whether
    Set ParseIf = Token
End Function
For it to work we had to change ParseConsts. Now it returns the last token it read.
Private Function ParseConsts( _
    ByVal Access As Accessibility, _
    ByVal Entity As Entity, _
    ByVal Body As KeyedList, _
    Optional ByVal InsideProc As Boolean _
) As Token
    Dim Name As String
    Dim Token As Token
    Dim Cnt As ConstConstruct
    Dim Xp As New Expressionist

    Debug.Assert Not Entity Is Nothing

    Do
        Rem Get Const's name
        Set Token = SkipLineBreaks
        If Not IsProperId(Token) Then Fail Token, Msg023, Msg003

        Set Cnt = New ConstConstruct
        Cnt.Access = Access
        Set Cnt.Id = NewId(Token)

        Set Token = NextToken

        Rem Do we have an As clause?
        If Token.IsKeyword(kwAs) Then
            If Token.Suffix <> vbNullChar Then Fail Token, Msg024

            Rem Get Const's data type name
            Set Token = NextToken
            If Not IsConstDataType(Token) Then Fail Token, Msg023, Msg025

            Set Cnt.DataType = NewDataType(Token)
            Set Token = NextToken

            If Token.IsOperator(opMul) Then
                If Cnt.DataType.Id.Name <> vString Then Fail Token, Msg026

                Set Cnt.DataType.FixedLength = Xp.GetExpression(Me)
                Set Token = Xp.LastToken
                If Cnt.DataType.FixedLength Is Nothing Then Fail Token, Msg065
            End If

        ElseIf Cnt.Id.Name.Suffix <> vbNullChar Then
            Rem Assign DataType property based on type suffix
            Set Cnt.DataType = FromChar(Cnt.Id.Name.Suffix)
        End If

        Rem Discard "="
        If Not Token.IsOperator(opEq) Then Fail Token, Msg023, "="

        Rem Get Const's value
        Set Cnt.Value = Xp.GetExpression(Me)
        If Cnt.Value Is Nothing Then Fail Token, Msg065

        Rem Ensure it's not a duplicated Const
        If Not InsideProc Then CheckDupl Entity, Cnt.Id.Name
        Name = NameOf(Cnt.Id.Name)
        If Body.Exists(Name) Then Fail Cnt.Id.Name, Msg006 & Name

        If Cnt.DataType Is Nothing Then
            Rem TODO: Infer its data type
        End If

        Rem Save it
        Body.AddKeyValue NameOf(Cnt.Id.Name), Cnt

        Rem Move on
        Set Token = Xp.LastToken
        If IsBreak(Token) Then Exit Do
        If InsideProc And Token.IsKeyword(kwElse) Then Exit Do
        If Token.Kind <> tkListSeparator Then Fail Token, Msg023, Msg027
    Loop

    Set ParseConsts = Token
End Function
This is so ParseBody can return it, too.
Private Function ParseBody( _
    ByVal Entity As Entity, _
    ByVal Body As KeyedList, _
    ByVal ClosingToken As Long, _
    Optional ByVal SingleLine As Boolean, _
    Optional ByVal LookAhead As Token _
) As Token
    Dim Token As Token
    Dim LinNum As LineNumberConstruct
    Dim Label As LabelConstruct

    Do
        If LookAhead Is Nothing Then
            Set Token = SkipLineBreaks
        Else
            Set Token = LookAhead
            Set LookAhead = Nothing
            If IsBreak(Token) Then Set Token = SkipLineBreaks
        End If

        Rem Do we have a line number?
        If Token.Kind = tkIntegerNumber Then
            Set LinNum = New LineNumberConstruct
            Set LinNum.Value = Token
            Body.Add LinNum

            Set Token = NextToken
        End If

        Rem Do we have a label?
        If Token.Kind = tkIdentifier Then
            Set LookAhead = NextToken

            If LookAhead.Kind = tkSoftLineBreak Then
                Set Label = New LabelConstruct
                Set Label.Id = NewId(Token)
                Body.Add Label

                Set LookAhead = Nothing
                Set Token = NextToken
            End If
        End If

        Select Case Token.Kind
            Case tkKeyword
                Select Case Token.Code
                    Case kwEnd
                        Rem Is it the End statement of End Sub, Function, or Property?
                        Set LookAhead = NextToken

                        If LookAhead.IsKeyword(ClosingToken) Then
                            Set Token = LookAhead
                            Exit Do
                        End If

                        If LookAhead.Kind = tkIdentifier And LookAhead.Code = cxProperty Then
                            Set Token = LookAhead
                            Exit Do
                        End If

                        Body.Add New EndConstruct

                    Case kwDim
                        ParseDim acLocal, Entity, Body, InsideProc:=True

                    Case kwStatic
                        ParseDim acLocal, Entity, Body, InsideProc:=True, IsStatic:=True

                    Case kwConst
                        Set Token = ParseConsts(acLocal, Entity, Body, InsideProc:=True)

                    Case kwIf
                        Rem Keep the token ParseIf stopped at, as with ParseConsts
                        Set Token = ParseIf(Entity, Body)

                    Case kwElseIf, kwElse
                        Exit Do

                    Case Else
                        Stop
                End Select

            Case tkIdentifier
                'TODO Complete

            Case tkEndOfStream
                Exit Do

            Case Else
                Stop
        End Select

        If SingleLine Then Exit Do
    Loop

    Set ParseBody = Token
End Function
Rem Add it to Messages
Public Property Get Msg085() As String
    Msg085 = "Expected: End "
End Property

Public Property Get Msg087() As String
    Msg087 = "Expected: statement"
End Property

Public Property Get Msg088() As String
    Msg088 = "Rule: If condition Then"
End Property

Public Property Get Msg089() As String
    Msg089 = "Expected: Else or ElseIf or End If"
End Property

Public Property Get Msg090() As String
    Msg090 = "Block If without End If"
End Property
And we created an IsHardBreak function. IsBreak makes no distinction between soft breaks and hard breaks, and now we need to tell them apart:
Friend Function IsHardBreak(ByVal Token As Token) As Boolean
    Debug.Assert Not Token Is Nothing
    IsHardBreak = Token.Kind = tkHardLineBreak Or Token.Kind = tkComment
End Function
This was much harder than I thought it would be...
Next week we'll parse Select Cases.
* My first attempt did not survive a bunch of tests.
This is the second one and, yes, it has a GoTo there. Do I like it? No. Will I change it? Not for now.
Sorry if it sounded a little grumpy. This ParseIf took me too much time to come up with and I'm not 100% certain it fits the bill.
Only time will tell.
** Bug fix.
Andrej Biasic
2021-03-17