Which language are you using to write it, and which language does it compile? What are your struggles? What are your successes?
I’m planning on compiling this: https://github.com/Apostolique/Vyne-Language. I’ll develop the language more as I go through the book. My original plan was to use C#, but maybe you’ll convince me to try TypeScript.
I’ve been meaning to work on the compiler for a while, but it’s hard to balance all my projects.
No successes yet, but the draft was interesting. I never considered ARM as a possible compilation target so this should be interesting.
@Apos Vyne looks interesting! I would recommend using whatever language you’re most comfortable in. It’s challenging enough to write a compiler, and making one in a new language is even harder. The kind of TypeScript that is in the book should map well to C#.
I’m planning on using Golang which is a large part of my day job. I’m comfortable with it and the level of pedantry it requires seems just about right. I also use Python a lot, but I feel its dynamic nature will hinder me more than it will help.
To me the target language is not interesting (yet). I’m just looking forward to grasping the concepts and making the simplest thing that could possibly work.
@stn I don’t have a lot of experience with Go, but I think that code in the book should map well to Go. I haven’t used class inheritance (only interface inheritance) exactly for this reason: so that code is portable to languages like Go. Let us know how it goes!
It’s mostly going well. I started writing a bit of the parser combinators in C#, but I’m finding the generics to be pretty heavy. For example:
And I didn’t get to the interesting stuff yet. Currently looking for ways to simplify that. Also I might have written some of them wrong, I’ll have to test later.
I tested with TypeScript too, but I’m not sold on it yet. It feels too much like JavaScript which I guess is the point, but I have more fun with JavaScript.
Yeah, these type annotations are heavy. One thing that comes to mind is to make the token definitions local to a function. This way, you can take advantage of local type inference. You could write:
var NOT = token(@"!").Map<Func<AST, Not>>(_ => term => new Not(term));
Instead of:
public static Parser<Func<AST, Not>> NOT = token(@"!").Map<Func<AST, Not>>(_ => term => new Not(term));
But that forces you to have all of the parser defined within a single function, I think.
I’m not an expert on C#, so maybe someone has a better idea?
@Apos, by the way, nice idea using @"\G" and the Match(string, int) overload to solve the problem of matching at a specific position in a string.
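For comparison, here is a rough TypeScript sketch of the same "match exactly at this position" idea. It relies on the sticky flag (`y`) and `lastIndex` rather than `\G`. The helper name `matchAt` is made up for illustration and is not taken from the book or from anyone's repository.

```typescript
// Hypothetical helper: try to match `pattern` in `text` starting exactly at `index`.
// Returns the matched substring, or null if the pattern does not match at that position.
function matchAt(pattern: RegExp, text: string, index: number): string | null {
  // Build a copy with the sticky flag so the caller's regex object is not mutated.
  const flags = pattern.flags.includes("y") ? pattern.flags : pattern.flags + "y";
  const sticky = new RegExp(pattern.source, flags);
  // With the sticky flag, exec() only attempts a match at lastIndex.
  sticky.lastIndex = index;
  const match = sticky.exec(text);
  return match === null ? null : match[0];
}

const digits = /[0-9]+/;
const found = matchAt(digits, "ab123", 2);  // matches "123" at index 2
const missed = matchAt(digits, "ab123", 0); // null: no digits at index 0
```

Without the sticky flag, the regex would scan forward and find "123" even when asked about index 0, which is exactly the behavior a parser needs to avoid.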
I’m first writing the parser combinator utilities in a separate project, in C# (.NET Core): https://github.com/mukadr/ParseDotNet
The next part will be a compiler for some Pascal dialect, targeting ARM as explained in the book. I’m using my spare time to work on this project and study more about code generation. I would like to do register allocation.
I liked the approach used in the book for writing parsers. It is convenient to test each piece separately.
Welcome, @mukadr! Your parser combinators look good. It is interesting to compare with @Apos’ approach. It seems that you’ve decided not to use regular expressions, but instead have several source.Match overloads for chars and strings and such? This should also work. It will also be interesting to see if you know a way to combat the verbose type annotations, like in @Apos’ case:
public static Parser<Func<AST, Not>> NOT = token(@"!").Map<Func<AST, Not>>(_ => term => new Not(term));
Yes, it matches chars and ZeroOrMore/OneOrMore concatenate them into a string. So far it seems enough for my needs.
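To make the idea concrete, here is a minimal TypeScript sketch of character-level parsers whose repetitions concatenate into a string. It is not the ParseDotNet code: the names `symbol`, `zeroOrMore` and `oneOrMore` and the function-based `Parser` type are assumptions chosen for illustration.

```typescript
// A parser takes the input and a position, and returns the parsed value
// plus the new position, or null on failure. (Illustrative types only.)
type Result<T> = { value: T; index: number } | null;
type Parser<T> = (text: string, index: number) => Result<T>;

// Match a single character that satisfies the predicate.
function symbol(predicate: (c: string) => boolean): Parser<string> {
  return (text, index) =>
    index < text.length && predicate(text[index])
      ? { value: text[index], index: index + 1 }
      : null;
}

// Apply `p` repeatedly and concatenate the matched pieces into one string.
function zeroOrMore(p: Parser<string>): Parser<string> {
  return (text, index) => {
    let value = "";
    let r = p(text, index);
    // Stop on failure, or if `p` succeeded without consuming input.
    while (r !== null && r.index > index) {
      value += r.value;
      index = r.index;
      r = p(text, index);
    }
    return { value, index };
  };
}

// Same as zeroOrMore, but fails when nothing was consumed.
function oneOrMore(p: Parser<string>): Parser<string> {
  const many = zeroOrMore(p);
  return (text, index) => {
    const r = many(text, index);
    return r !== null && r.index > index ? r : null;
  };
}

const digit = symbol((c) => c >= "0" && c <= "9");
const integer = oneOrMore(digit);
```

Each piece can be exercised on its own, for example `integer("123abc", 0)` against `integer("abc", 0)`, which is what makes this style convenient to test.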
One problem I had with the Parser class was with static methods.
Since Parser has a generic type T, every call to a static method is required to specify the type T.
For example (from @Apos):
public static Parser<string> Whitespace = Parser<string>.Regexp(@"[ \n\r\t]+");
Notice that Parser<string>.Regexp could be written as Parser.Regexp, since it does not depend on the type T.
To solve this issue in C#, one can move the static methods to a different class, like:
public static class ParserFactory
{
    …
    public static Parser<string> Regexp(string regexp, RegexOptions options = RegexOptions.None)
    {
        …
    }
}
Then you can get rid of the generics:
var parser = ParserFactory.Regexp(…);
There is also the “using static” directive to import all static methods from a class:
using static x.y.ParserFactory;
…
{
var parser = Regexp(…);
}
…
Unfortunately, I could not find a way to simplify the class fields, since var only works for local variables:
public static Parser<Func<AST, Not>> NOT = …
I’m writing my version of the compiler in Nim. It will hopefully compile your language (with a number of restrictions) into the ARM assembly that you output. I currently have the AST portion completed as well as a lexer (using the regular expressions you had developed for this). I am currently working on the parser, which should glue the other two modules together.
I do love the fact that, built the way you have it, I can easily test the three modules separately.
Looks like Nim has both variant types and inheritance with dynamic dispatch. Which one did you decide to use for the AST?
I used inheritance + dynamic dispatch with my AST. I think it leads to cleaner code.
If I used variants, there are some limitations that drive me nuts (I ran into them in some other Nim code of mine). I also think that variants would mean having one emit procedure that looks at which variant it is and goes from there: either one big case statement, or a call to a specific procedure per variant (which is basically the same thing as dynamic dispatch in the end).
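For readers weighing the two styles, here is a small TypeScript sketch of both, using two made-up node types. All names (`Num`, `Not`, `VariantNode`, `emitVariant`) are hypothetical, and the emitted ARM instructions are only illustrative, not the code generation from the book or from the Nim project.

```typescript
// Style 1: interface + dynamic dispatch. Each node knows how to emit itself.
interface AST {
  emit(): string;
}

class Num implements AST {
  constructor(public value: number) {}
  emit(): string {
    return `ldr r0, =${this.value}`;
  }
}

class Not implements AST {
  constructor(public term: AST) {}
  emit(): string {
    return [this.term.emit(), "cmp r0, #0", "moveq r0, #1", "movne r0, #0"].join("\n");
  }
}

// Style 2: variant (tagged union) + one function with a case per variant.
type VariantNode =
  | { kind: "num"; value: number }
  | { kind: "not"; term: VariantNode };

function emitVariant(node: VariantNode): string {
  switch (node.kind) {
    case "num":
      return `ldr r0, =${node.value}`;
    case "not":
      return [emitVariant(node.term), "cmp r0, #0", "moveq r0, #1", "movne r0, #0"].join("\n");
  }
}

// Both styles produce the same output for the same tree.
const viaDispatch = new Not(new Num(1)).emit();
const viaVariant = emitVariant({ kind: "not", term: { kind: "num", value: 1 } });
```

The trade-off is roughly the one described above: dispatch keeps each node's logic next to its definition, while the variant style gathers one operation for all nodes in a single case statement.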
Moved the code to another repository: https://github.com/mukadr/SharpPascal
Doing the parsing in a single method, so I can use var for all declarations: https://github.com/mukadr/SharpPascal/blob/master/src/SharpPascal/Syntax/PascalParser.cs
Thanks to this “functional approach to writing parsers” I’m considering learning F#.
Yep, F# has good type inference and would be a good fit for this. It would look something like this, in your case:
module PascalParser =
    let mutable currentLine = 1
    let report message = sprintf "%d: %s" currentLine message
    let whitespace =
        OneOrMore(
            Symbol(' ')
                .Or(Symbol('\t'))
                .Or(Symbol('\n').Bind(fun c ->
                    currentLine <- currentLine + 1;
                    Constant(c)
                )).Or(Symbol('\r')))
    ...
Where module is similar to a static class.
By the way, nice that you track the current line number. It can be incorporated into the Source class to make it transparent to the user of it.
@mukadr, I just realized that, because of backtracking, the currentLine++; statement might be executed erroneously multiple times on the same line.
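One possible way around this, sketched in TypeScript: instead of incrementing a counter as a side effect, derive the line number from the current position in the input. Backtracking then just means going back to an earlier position, and the line number follows automatically. This `Source` class is a hypothetical illustration, not the one from the book or from SharpPascal.

```typescript
// Immutable source position. The line number is computed from `index`,
// so there is no mutable counter for backtracking to corrupt.
class Source {
  constructor(public readonly text: string, public readonly index: number = 0) {}

  // Count the newlines before the current position (lines are 1-based).
  get line(): number {
    let line = 1;
    for (let i = 0; i < this.index; i++) {
      if (this.text[i] === "\n") line++;
    }
    return line;
  }

  // Return a new Source moved forward by `n` characters.
  advance(n: number): Source {
    return new Source(this.text, this.index + n);
  }
}

const start = new Source("a\nb\nc");
const ahead = start.advance(4); // positioned at "c", on the third line
// Backtracking is just reusing `start`; its line number is unaffected.
```

Counting newlines on every call is linear in the position, which is usually fine when the line is only needed for error messages; caching is possible if it ever becomes a bottleneck.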