Metamorphing Machine I rather be this walking metamorphosis
than having that old formed opinion about everything!

Let's build a transpiler! Part 2

This is the second post in a series about building a transpiler.
Here is the previous one.

After dealing with identifiers and keywords, let's get to literals.
There are at least three types of literals: numbers, dates, and strings.
Let's start with the last, as it is the simplest one.
Strings are enclosed in double quotes, and between them we can have almost anything, including another quote, as long as it is doubled, so we know we have not reached the closing one yet. A string cannot contain a line break, though. Visual Basic is a line-oriented programming language: statements end with a line break, so one cannot be embedded in a string.
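To make the doubling rule concrete, here is a tiny sketch of how Visual Basic itself reads such literals. The procedure name DemoQuotes is made up for illustration; paste it into any module and run it from the Immediate window.

```vb
' Hypothetical demo, not part of the transpiler.
Sub DemoQuotes()
    Dim S As String

    ' Each doubled quote inside the literal stands for one quote character.
    S = "He said ""Hi"""
    Debug.Print S           ' He said "Hi"
    Debug.Print Len(S)      ' 12

    ' Four quotes in a row: opening quote, one doubled quote, closing quote.
    ' That is a string holding a single quote character.
    Debug.Print Len("""")   ' 1
End Sub
```

The last line is why the scanner code compares a character against """" when it wants to know whether it has just read a quote.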

Building over our previous code, a function to read literal strings can be something like this:

Private Function ReadString(ByVal FileHandle As Integer) As String
    Const MAX_LENGTH = 1013
    Dim Buffer As String * MAX_LENGTH
    Dim Ch As String * 1
    Dim Count As Integer
    Dim Cp As Integer
    Dim Pos As Long

    Do
        If Count = MAX_LENGTH Then Err.Raise vbObjectError + 13, , "String too long"

        If EOF(FileHandle) Then
            Ch = vbCr
        Else
            GoSub GetChar
        End If

        Select Case Ch
            Case """"
                If EOF(FileHandle) Then Exit Do
                GoSub GetChar

                If Ch = """" Then
                    GoSub Append
                Else
                    Rem We read too much. Let's put it "back".
                    Pos = Seek(FileHandle)
                    Seek #FileHandle, Pos - 2
                    Exit Do
                End If

            Case vbCr, vbLf
                Err.Raise vbObjectError + 13, , "Unclosed string"

            Case Else
                GoSub Append
        End Select
    Loop

    ReadString = Left$(Buffer, Count)
    Exit Function

Append:
    Count = Count + 1
    Mid$(Buffer, Count, 1) = Ch
    Return

GetChar:
    Get #FileHandle, , Cp
    Ch = ToChar(Cp)
    Return
End Function
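A note on the "put it back" step: every Get reads one Integer, which is two bytes, so stepping back one character means moving the file position back by 2. Here is a minimal sketch of that idea as a standalone helper. UngetChar and DemoUnget are hypothetical names, and the demo assumes a file opened For Binary holding two-byte characters; it uses ChrW$ rather than the ToChar function from the previous post so that it runs on its own.

```vb
' Hypothetical helper: step back over the last two-byte character read.
Private Sub UngetChar(ByVal FileHandle As Integer)
    Seek #FileHandle, Seek(FileHandle) - 2
End Sub

' Hypothetical demo: write "AB" as two Integers, then read "A" twice.
Sub DemoUnget()
    Dim H As Integer
    Dim Cp As Integer
    Dim Path As String

    Path = Environ$("TEMP") & "\unget-demo.bin"
    H = FreeFile
    Open Path For Binary As #H

    Cp = AscW("A"): Put #H, , Cp
    Cp = AscW("B"): Put #H, , Cp

    Seek #H, 1                  ' Back to the first byte of the file.
    Get #H, , Cp                ' Reads "A".
    UngetChar H                 ' Position is 3 after the read; move it back to 1.
    Get #H, , Cp                ' Reads "A" again.
    Debug.Print ChrW$(Cp)       ' A

    Close #H
    Kill Path
End Sub
```

The functions in this post inline those two lines instead of calling a helper, but the effect should be the same.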

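Now, let's handle the numbers. We have a few different formats to deal with: integers, floating-point numbers, numbers in scientific notation, and octal and hexadecimal numbers.
Also, we need to take type suffixes into consideration. More about that later.
Low-hanging fruit first: reading integers.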

Private Function ReadInteger(ByVal FileHandle As Integer, Optional ByVal FirstDigit As String) As String
    Const MAX_LENGTH = 29
    Dim Buffer As String * MAX_LENGTH
    Dim Ch As String * 1
    Dim Count As Integer
    Dim Cp As Integer
    Dim Pos As Long

    If FirstDigit >= "0" And FirstDigit <= "9" Then
        Count = 1
        Mid$(Buffer, Count, 1) = FirstDigit
    End If

    Do Until EOF(FileHandle)
        If Count = MAX_LENGTH Then Err.Raise vbObjectError + 13, , "Literal too long"
        Get #FileHandle, , Cp
        Ch = ToChar(Cp)

        Select Case Ch
            Case "0" To "9"
                Count = Count + 1
                Mid$(Buffer, Count, 1) = Ch

            Case Else
                Pos = Seek(FileHandle)
                Seek #FileHandle, Pos - 2
                Exit Do
        End Select
    Loop

    ReadInteger = Left$(Buffer, Count)
End Function
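Both readers so far collect characters the same way: a fixed-length string serves as the buffer, the Mid$ statement overwrites one position at a time, and Left$ trims the result to what was actually written. Here is a small sketch of just that pattern, with a made-up procedure name and buffer size:

```vb
' Hypothetical demo of the buffer pattern used by the reader functions.
Sub DemoFixedBuffer()
    Dim Buffer As String * 8
    Dim Count As Integer

    ' The Mid$ statement replaces characters in place; no new string is built.
    Count = Count + 1: Mid$(Buffer, Count, 1) = "4"
    Count = Count + 1: Mid$(Buffer, Count, 1) = "2"

    ' Only the first Count characters are meaningful.
    Debug.Print "[" & Left$(Buffer, Count) & "]"    ' [42]

    ' A fixed-length string never grows or shrinks.
    Debug.Print Len(Buffer)                         ' 8
End Sub
```

Because the buffer cannot grow, each reader has to check Count against MAX_LENGTH before appending, which is where the "too long" errors come from.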

Being able to scan an integer literal makes reading floats easier. The function below will scan either an integer or a float literal.

Private Function ReadFloat(ByVal FileHandle As Integer, ByVal FirstDigit As String) As String
    Dim Result As String
    Dim FracPart As String
    Dim Cp As Integer
    Dim Ch As String * 1
    Dim Pos As Long

    Result = ReadInteger(FileHandle, FirstDigit:=FirstDigit)

    If Not EOF(FileHandle) Then
        Get #FileHandle, , Cp
        Ch = ToChar(Cp)

        If Ch = "." Then
            FracPart = ReadInteger(FileHandle)
            If FracPart = "" Then Err.Raise vbObjectError + 13, , "Invalid literal"
            Result = Result & "." & FracPart
        Else
            Pos = Seek(FileHandle)
            Seek #FileHandle, Pos - 2
        End If
    End If

    ReadFloat = Result
End Function

Now it is even easier to scan a literal number in scientific notation:

Private Function ReadNumber(ByVal FileHandle As Integer, ByVal FirstDigit As String) As String
    Dim Result As String
    Dim FracPart As String
    Dim Cp As Integer
    Dim Ch As String * 1
    Dim Sg As String * 1
    Dim Pos As Long

    Result = ReadFloat(FileHandle, FirstDigit)

    If Not EOF(FileHandle) Then
        Get #FileHandle, , Cp
        Ch = ToChar(Cp)

        Select Case Ch
            Case "e", "E"
                If EOF(FileHandle) Then
                    GoSub UngetChar
                Else
                    Get #FileHandle, , Cp
                    Sg = ToChar(Cp)

                    If Sg = "-" Or Sg = "+" Then
                        Ch = ""
                    Else
                        Ch = Sg
                        Sg = "+"
                    End If

                    FracPart = ReadInteger(FileHandle, FirstDigit:=Ch)
                    If FracPart = "" Then Err.Raise vbObjectError + 13, , "Invalid literal"
                    Result = Result & "E" & Sg & FracPart
                End If

            Case Else
                GoSub UngetChar
        End Select
    End If

    ReadNumber = Result
    Exit Function

UngetChar:
    Pos = Seek(FileHandle)
    Seek #FileHandle, Pos - 2
    Return
End Function
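The three functions above layer on each other: digits, then an optional "." and more digits, then an optional exponent with an optional sign. To see that shape without any file handling, here is a rough sketch that scans a String instead. ScanDigits, ScanNumber and DemoScanNumber are hypothetical names, not part of the transpiler, and the sketch simply ignores whatever follows the number.

```vb
' Hypothetical helper: collect consecutive digits of Source starting at Pos.
Private Function ScanDigits(ByVal Source As String, ByRef Pos As Long) As String
    Dim Start As Long

    Start = Pos
    Do While Pos <= Len(Source)
        If Mid$(Source, Pos, 1) Like "[0-9]" Then Pos = Pos + 1 Else Exit Do
    Loop

    ScanDigits = Mid$(Source, Start, Pos - Start)
End Function

' Hypothetical string-based counterpart of ReadNumber.
Public Function ScanNumber(ByVal Source As String) As String
    Dim Pos As Long
    Dim Result As String
    Dim Part As String
    Dim Sg As String

    Pos = 1
    Result = ScanDigits(Source, Pos)

    ' Optional fractional part.
    If Mid$(Source, Pos, 1) = "." Then
        Pos = Pos + 1
        Part = ScanDigits(Source, Pos)
        If Part = "" Then Err.Raise vbObjectError + 13, , "Invalid literal"
        Result = Result & "." & Part
    End If

    ' Optional exponent; a missing sign is written out as "+".
    If UCase$(Mid$(Source, Pos, 1)) = "E" Then
        Pos = Pos + 1
        Sg = "+"

        Select Case Mid$(Source, Pos, 1)
            Case "+", "-"
                Sg = Mid$(Source, Pos, 1)
                Pos = Pos + 1
        End Select

        Part = ScanDigits(Source, Pos)
        If Part = "" Then Err.Raise vbObjectError + 13, , "Invalid literal"
        Result = Result & "E" & Sg & Part
    End If

    ScanNumber = Result
End Function

Sub DemoScanNumber()
    Debug.Print ScanNumber("42")        ' 42
    Debug.Print ScanNumber("3.14")      ' 3.14
    Debug.Print ScanNumber("1e5")       ' 1E+5
    Debug.Print ScanNumber("2.5E-3")    ' 2.5E-3
End Sub
```

Like ReadNumber, the sketch normalizes the exponent marker to an uppercase "E" and always writes the sign.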

Now, to octal and hexadecimal literals. Both start with an ampersand, so we need to figure out what's next:

Private Function ReadAmpersand(ByVal FileHandle As Integer) As String
    Dim Cp As Integer
    Dim Ch As String * 1
    Dim Result As String

    If EOF(FileHandle) Then GoTo Fail
    Get #FileHandle, , Cp
    Ch = ToChar(Cp)

    Select Case Ch
        Case "o", "O"
            Result = ReadOctal(FileHandle)

        Case "h", "H"
            Result = ReadHexa(FileHandle)

        Case Else
            GoTo Fail
    End Select

    ReadAmpersand = Result
    Exit Function

Fail:
    Err.Raise vbObjectError + 13, , "Invalid literal"
End Function

Private Function ReadOctal(ByVal FileHandle As Integer) As String
    Const MAX_LENGTH = 32
    Dim Buffer As String * MAX_LENGTH
    Dim Count As Integer
    Dim Cp As Integer
    Dim Ch As String * 1
    Dim Pos As Long

    Do While Not EOF(FileHandle)
        If Count = MAX_LENGTH Then Err.Raise vbObjectError + 13, , "Literal too long"
        Get #FileHandle, , Cp
        Ch = ToChar(Cp)

        Select Case Ch
            Case "0" To "7"
                Count = Count + 1
                Mid$(Buffer, Count, 1) = Ch

            Case Else
                GoSub UngetChar
                Exit Do
        End Select
    Loop

    If Count = 0 Then Err.Raise vbObjectError + 13, , "Invalid literal"
    ReadOctal = Left$(Buffer, Count)
    Exit Function

UngetChar:
    Pos = Seek(FileHandle)
    Seek #FileHandle, Pos - 2
    Return
End Function

Private Function ReadHexa(ByVal FileHandle As Integer) As String
    Const MAX_LENGTH = 24
    Dim Buffer As String * MAX_LENGTH
    Dim Count As Integer
    Dim Cp As Integer
    Dim Ch As String * 1
    Dim Pos As Long

    Do While Not EOF(FileHandle)
        If Count = MAX_LENGTH Then Err.Raise vbObjectError + 13, , "Literal too long"
        Get #FileHandle, , Cp
        Ch = ToChar(Cp)

        Select Case Ch
            Case "0" To "9", "a" To "f", "A" To "F"
                Count = Count + 1
                Mid$(Buffer, Count, 1) = Ch

            Case Else
                GoSub UngetChar
                Exit Do
        End Select
    Loop

    If Count = 0 Then Err.Raise vbObjectError + 13, , "Invalid literal"
    ReadHexa = Left$(Buffer, Count)
    Exit Function

UngetChar:
    Pos = Seek(FileHandle)
    Seek #FileHandle, Pos - 2
    Return
End Function
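For reference, this is how Visual Basic itself treats such literals. The small demo below uses a made-up procedure name. Its last two lines hint at why type suffixes matter: the same hexadecimal digits give a different value with a trailing "&".

```vb
' Hypothetical demo of octal and hexadecimal literals in Visual Basic.
Sub DemoRadixLiterals()
    Debug.Print &O17        ' 15
    Debug.Print &HFF        ' 255

    ' Without a suffix, a literal that fits in 16 bits is an Integer,
    ' so all bits set means -1.
    Debug.Print &HFFFF      ' -1

    ' The "&" suffix makes it a Long instead.
    Debug.Print &HFFFF&     ' 65535
End Sub
```

ReadOctal and ReadHexa return only the digits as text, so a decision like this one is left to a later stage.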

You may have noticed that we are not dealing with negative numbers. We will see that when we get to operators.

The last kind of literal is the hardest one. Some examples of valid date literals: The last one is highly ambiguous. How can we tackle that?
Let's see next week.

Andrej Biasic
2020-07-22