Following our great fun last time implementing a brainfuck compiler and interpreter, I had to step up and start working on providing the community with a more civilized way of programming. Namely, functions. So, what’s a great yet simple language to request implementation for?
Did I ever mention I’m a sucker for esolangs?
The problem with functional esolangs is one of choice. Two excellent candidates stand out:
Unlambda: the first player on the block. Untyped lambda calculus, banning the use of lambda itself through extensive use of the SKI combinator set. Functional with all the bells and whistles you can dream of. Promises and continuations to boot, that you couldn’t even decently demand.
Lazy K: combines most of Unlambda, Iota and Jot in a unified syntax of pure functional awesomeness.
I know about Unlambda since about 2000… which, in retrospect, is not so much later after it was published. Ironically, I was already very familiar with the author’s website at the time, his quine page being a personal favorite; yet somehow I had missed that other outstanding aspect of his deed.
Lazy K, on the other hand, I’d only heard about some time around 2009 (and properly categorized within the esolang family), but hadn’t popped the lid open until now. And… it’s great! It makes all of the grand reasonable syntaxes available, and improves on the not-so-user-friendly aspects of Unlambda’s I/O protocol.
So how to pick?
CG puzzles are a peculiar beast. All sorts of audiences contend, with extreme variance in personal goals—and self-tolerated acceptable means of achieving them. This means some are going to attempt to claim their XP with the least possible effort. That effort wouldn’t be too high for languages with publicly available interpreters. So it would be a better2 puzzle if the required program wasn’t something that already exists. While still honoring the glorious legacy of its ancestor languages.
Lazy K has a great I/O concept. Very old-school Haskelly. I’d like to keep it. Unlambda has delays and continuations. I’d so like to keep them. Lazy K has Iota and Jot syntax, but I don’t care too much about these, and don’t think they’d add much to a puzzle. Unlambda has Turing tarpit output, and clunky input.
But those are code-writing considerations!
Writing an interpreter is a very different exercise. We don’t care so much about the I/O protocol being a nuisance in such a context: the programs are provided as test cases anyway! We do care that the implementation is challenging enough, so Unlambda would take a very serious lead indeed. Tarpit I/O becomes a desirable feature here. But we’d still like to raise the hurdle to copy-paste code from the interwebz and cheat through.
That’s where ICFP comes in. Without delving too much into this year’s contest’s minutiae, it did have the very welcome effect of reminding me there were alternative Turing-complete combinator sets.3 Namely, the BCKW system was provided, and we can make use of it here too.
The real fun part is: there’s no need to have any kind of deep understanding of either to convert code to BCKW instead of SKI, be it native Unlambda code or the interpreter. It can all be done4 by mechanical interpolation. So that’s what I do in the stream:
- implement a (reduced) Unlambda interpreter
- augment it with the BCKW combinators
- augment it with a SKI-to-BCKW transpiler
- convert the Unlambda code compendium to BCKW
- verify both versions behave the same
I tricked myself on that last step. All these past years of coding in purely functional Haskell have made me forget Unlambda isn’t pure at all! My mechanical SKI-to-BCKW conversion (possibly) worked just fine to return programs that computed the same result, but absolutely did not perform their I/O side effects in the same order!
Oops.
Well, given the cases I had, it didn’t need decompilation to fix, so that went quick enough. But I now have an additional item on my agenda: actually do my homework and understand a conversion from pure lambda calculus to BCKW. Before CGLambda v2 anyway.
I realize we’ve gotten this far without me even showing the language. How rude of me!
In its current state, a simple CGLambda program would look like this:
````````````HELLO_WORLDrv
You should immediately notice how much more clear this is than the equivalent brainfuck program. But I digress.
As mentioned above, Unlambda can be summed up as an untyped lambda calculus interpreter, where certain primitives happen to have I/O side effects when applied. It also happens to have eliminated lambda abstractions from its syntax, but that’s more of a status than a property. It features a nice and clean parenthesis-free5 syntax by providing a simple unique operator: `
(pronounced apply).
Where traditional lambda calculus would express function application as concatenation:
FG # this is F applied to G in at least *some* language
…Unlambda expresses it like this:
`FG
An Unlambda program is simply an expression, where an expression is either:
- the application of an expression to another:
`FG
- or a primitive
…and nothing else! Specifically, if F
and G
are valid Unlambda expressions, their concatenation FG
is not.
To evaluate `FG
, you’d:
- evaluate
F
- evaluate
G
- return the result of applying the first result to the second
This strange syntax can seem more complicated than useful until you consider associativity. Traditional lambda calculus is left-associative. So:
F G H = ((F G) H)
Computations that express as a simple comb of chained primitives are precious few. The average one is much more likely to be some tree of expressions, such as this one:
S(K(SII))(S(S(KS)K)(K(SII)))
= ((S(K((SI)I)))((S((S(KS))K))(K((SI)I))))
The first line is the associativity-compressed form; the second the fully unambiguous parenthesized one. In Unlambda, it’s written as:
``s`k``sii``s``s`ksk`k``sii
It’s only a bit shorter on this simple example, but you can imagine6 how much of a difference it makes on enterprise-scale applications. This syntax is a big win on the verbosity front. Plus, it’s lowercase, which everyone knows to be more readable7.
All I need to add for you to know absolutely all there is is the primitive list. Here it is for Unlambda 1:
k
: the K combinator from lambda calculus. Semantics:K=λxy.y
s
: the S combinator.S=λxyz.xz(yz)
i
: the I combinator.I=λx.x
v
: a “void” combinator.V=λx.V
c
: call with current continuation, aka call/cc. This one’s not from lambda calculus. It makes the programmer’s life easier. But who cares about the programmer? It makes the implementor’s life harder, which is a secondary goal here.d
: delay, aka promise.`dX
, as an exception to usual Unlambda semantics, does not evaluateX
. Not until its result is applied, that is:``dXY=`XY
, withY
evaluated beforeX
..X
: print. Value semantics are those ofi
; side effect when applied are to printX
. This isn’t really “a” primitive, more like a family of them, one for each character.8r
: newline. It’s just a synonym for the equivalent.X
primitive.
Our changes for CGLambda will be at the primitive level:
- base combinators
s
andi
are removed.b
, the B “composition” combinator is added.B=λxyz.x(yz)
x
, the C9 “swap” combinator is added.C=λxyz.xzy
k
is kept.w
, the W “duplication” combinator is added.W=λxy.xyy
- output functions
r
is kept as-is.- The two-character print primitives are renamed: any capital letter and the underscore are self-printing I combinators. Note that the dot is removed: they’re all one-character operators now. There is no way to print anything but capital letters, underscores and newlines.
- beginner friendliness
d
is omitted. For now.c
is omitted. For now.v
is kept, it isn’t so hard.
This makes for CGLambda level 1—let’s say medium difficulty on a CG scale. Implement an interpreter for that, pass the tests10 and you’re golden.
If this manages to pass moderation, CGLambda level 2 won’t be too far off:
c
andd
make a comeback!e
, the “exit” operator from Unlambda 2 is added. And we might start considering the program’s return value in addition to its output. With all due consideration to its well-definedness.- some input facility is added. I’m still on the fence as to which one. The Unlambda 2 variant, the {
@
,?x
,|
} set, is as clunky as ever, and introduces the two-character primitive problem again. Probably I’ll go with the Lazy K model of mapping input to output instead. If I can find a proper way to express it that happens to fit in the statement.
Good11 use of c
could really bring tears to users of those languages that are too imperative to have continuations, or to even conceive of them. I’m so excited.
If that passes, we’re getting closer to pipe dream territory, where I conjure enough time to write a CGLambda interpretation viewer, and we can have an optimization game about it. BCKW calculus is less explored than SKI, even less so with the syntax, semantics and protocol we’re giving it. I’m sure it can be very interesting. And trigger a lot of complaints.12
I’ve currently got a level 1 interpreter and a few “tutorial” tests cases. Careful proofreading and writing a statement are all it needs to be published as WIP. An independent implementation of an interpreter would then go a long way towards feeling confident about the statement and tests. Please volunteer.
Stream resumption is planned for this evening at the usual time. Stream closed. Notable changes: the level 1 language is called CGLambda Lite; the apply operator is now a slash instead of a backtick, for puny CG formatting considerations. 😞
Thanks to @dbdr and @Zorg1 for proofreading parts of this.
Not trimmed and edited yet, sorry! Not even complete yet. Later today?↩︎
…in my very personal view. But I’m the one writing the puzzle. 🧐↩︎
Don’t let my former teachers hear about this. I’ll deny ever forgetting their fine curriculum. 🙊↩︎
It will be pretty far from being optimal though. But that’s not a real hindrance for an interpreter implementation puzzle. A bit of the opposite, actually. 🙈↩︎
Lisp failed to make parentheses popular. It did incept uber-creative pseudo-rationalizations for that on a massive scale; it’s quite entertaining to behold. 😂↩︎
Or you could trust me. But would you trust me here that you should? Chuck on that. 🤔↩︎
I carbon-date this shift in readability perception for markup languages at the time of the transition from HTML3.2 to HTML4. Typesetters knew it all along. 😣↩︎
Unlambda doesn’t really specify a character set or encoding. The easiest interpretation given its setting in time would be to keep it simple and agnostically byte-oriented. Though it could just as easily be extended to “character- and encoding-agnostic”, for whatever character set suits our fancy. 😌↩︎
Not a typo.
c
is reserved forcall/cc
, so I’m calling this onex
for “exchange”. 😉↩︎I’ve pondered for a long time, and will opt to require the interpreter to stop output on its own after it reaches a certain threshold. It’s a bit dirty, but it makes a huge pan of *-lambda programs much easier to devise test cases for. It’s not too hard to implement. 🙃↩︎
“Good” meaning: where
longjmp()
won’t cut it. I hope. I’ll have to check its semantics are as primitive as I remember. 😈↩︎Which is a tertiary goal. 🙉↩︎