Tell us about your compiler!

mukadr · August 8, 2020, 3:38pm

Nice! I will continue with C# for now, but maybe a future rewrite in F# would be a good start.

mukadr · August 8, 2020, 3:40pm

Thanks for catching this! Thinking about how to fix it.

jdh30 · August 16, 2020, 6:30pm

I am using OCaml to write an ML compiler targeting armv8 on my Raspberry Pi 4 with the stock 32-bit OS. I am documenting the experience and trying to maintain an entire lineage here.

My struggle is mostly finding time. My success is that my 161-line compiler can compile and run:

let sqr = fun x -> x*x in (fun f -> f(f 3)) sqr
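
For reference, the test expression above is also valid OCaml, so its expected result can be checked with the stock toolchain. This is only a sanity check under that assumption; it says nothing about how the 161-line compiler evaluates it. The name `result` is introduced here purely for illustration.

```ocaml
(* The expression from the post, evaluated by stock OCaml
   (not by the compiler described in the post). *)
let result =
  let sqr = fun x -> x * x in
  (fun f -> f (f 3)) sqr

(* sqr (sqr 3) = sqr 9 = 81 *)
let () = Printf.printf "%d\n" result
```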

keleshev · August 16, 2020, 8:39pm

@jdh30 looks like you’ve got a lot of interesting things going on: a custom calling convention, an interesting way of keeping track of the environment, and you avoid libc. Looking forward to seeing your next steps!

Tamiyocs · September 27, 2020, 8:31pm

Going to be using Rust to create a custom statically typed language. Been working through creating compilers that emit bytecode, but I’m SUPER excited to be able to explore emitting raw assembly.

jdh30 · September 29, 2021, 8:46am

Having put quite some effort into my hobby compiler, I have come to the conclusion that my approach was entirely upside-down and the result was poor performance. I was using a stack-based compiler to compile a relatively sophisticated language into a relatively wide variety of assembly instructions. This resulted in far too many loads and stores and performance closer to ocamlc’s interpreted bytecode than ocamlopt’s native code: my generated code was 5.3x slower than ocamlopt’s! This left me disappointed and frustrated: why bother generating native code that runs as slowly as interpreted bytecode?! Then I had a revelation…

Especially if you’re targeting the register-rich Arm architectures, the initial focus should be restricted to:

Int constants
Calling conventions
Tail call elimination
Conditional execution
Register allocation

i.e. not arithmetic, loads, stores, strings, globals and so on.

Why? Because you can implement everything with just this by relegating all non-essential functionality to a C stdlib. Arithmetic operations become functions. Loads and stores become functions. Even stdin and stdout are obtained via function calls. You can call any C function!
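
A minimal sketch of that idea, in OCaml: a lowering pass that turns arithmetic and loads into plain calls, so that the back end only ever sees constants, variables, calls and conditionals. Everything named here is an assumption made for illustration (the `expr` and `core` types, the `lower` and `show` functions, and the runtime function names `rt_add`, `rt_mul`, `rt_load`); none of it is taken from the actual compiler.

```ocaml
(* Hypothetical source AST with built-in arithmetic and loads. *)
type expr =
  | Int of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr
  | Load of expr                 (* read memory at an address *)
  | If of expr * expr * expr

(* Hypothetical core language: no arithmetic or memory operations,
   only constants, variables, calls and conditionals. *)
type core =
  | CInt of int
  | CVar of string
  | CCall of string * core list
  | CIf of core * core * core

(* Replace every primitive operation with a call to an (assumed)
   C runtime function of the same meaning. *)
let rec lower = function
  | Int n -> CInt n
  | Var x -> CVar x
  | Add (a, b) -> CCall ("rt_add", [lower a; lower b])
  | Mul (a, b) -> CCall ("rt_mul", [lower a; lower b])
  | Load a -> CCall ("rt_load", [lower a])
  | If (c, t, e) -> CIf (lower c, lower t, lower e)

(* Pretty-printer for inspecting the lowered program. *)
let rec show = function
  | CInt n -> string_of_int n
  | CVar x -> x
  | CCall (f, args) ->
    f ^ "(" ^ String.concat ", " (List.map show args) ^ ")"
  | CIf (c, t, e) ->
    "if " ^ show c ^ " then " ^ show t ^ " else " ^ show e

let example = lower (Add (Var "x", Mul (Int 2, Load (Var "p"))))

(* Prints: rt_add(x, rt_mul(2, rt_load(p))) *)
let () = print_endline (show example)
```

After a pass like this, code generation only needs to handle integer constants, argument passing, calls and branches, which is roughly the list above.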

Using this approach, I have managed to implement a compiler for a general-purpose programming language that generates just 7 different assembly instructions. The entire compiler is just 202 lines of OCaml code (+68 more for lexing and parsing), and the generated code is “only” 2.9x slower than ocamlopt.

What do you think?

I’ve written an echo program and Fibonacci function in my current source language. Next I’ll try implementing a more sophisticated front end that targets this minimal language.

keleshev · September 29, 2021, 11:17am

@jdh30 thanks for the interesting experience report! Your original experiment is located in the following repo, right?

GitHub - jdh30/growing_a_compiler: Growing a compiler

Is the new compiler available publicly?

jdh30 · September 29, 2021, 11:47am

That’s the old one, yes. I haven’t made the new one public yet, though I’d like to. To be honest, I’m getting tired of all the tedium around VCS these days, so I’m working on something better. Maybe I’ll put it up there.

Also, I’d like to bootstrap it…

jdh30 · October 2, 2021, 1:25pm

I forgot -O2 when compiling my C stdlib, which actually makes a big difference: with it, my compiler generates code that is 2.25x slower than ocamlopt on both the Fibonacci and Hailstones benchmarks!

Also worth noting that those benchmarks are basically the best case scenario for OCaml at this point.

jsegarra · November 29, 2021, 1:15pm

Hi,

The two languages I have implemented:

A JavaScript interpreter: https://code.google.com/archive/p/sejscript, implemented in Delphi, using bytecode.

A transpiler from NoSQL to SQL: http://jsegarra.net/nosqlonsql, implemented in C#.