This is incredible. I'll be tinkering with this for a while, guaranteed. As the advantages of combining Lisp and Forth in this way are slowly revealed to me it's like unlocking parts of my brain to interact with each other that never have done before.
Pretentiousness on my part aside, this is a pretty mind-blowing concept. The interpreter being in less than 1000 lines of (comprehensible) C is all the more commendable (most of these minimal languages turn out to be some monolithic Rust thing or similar, which kinda defeats the entire purpose imo).
Excited to take a closer look at the source to see how the various data structures etc. are laid out in memory. I won't be able to resist making comparisons to my current/other favourite mini-language 'fe' (https://github.com/rxi/fe), a sub-1000-line Lisp also written in C. If you haven't seen that one I'd definitely recommend checking it out.
Thank you for putting this online. Happy hacking!
Very cool. I like that both lisp and forth were “discovered” and that this cvbp is more fundamental than both(?!). This reminds me of [pdf] https://dl.acm.org/doi/pdf/10.1145/181993.181999 and wonder if/how it relates.
Also thanks for sharing; looks interesting. I’ll read this when I have time and re-comment then. Off the bat: Forsp variables are definitely not linear.
Thank for sharing that paper. I am currently implementing a compiler that has very similar ideas to those presented in it. It's very different syntactically, but the mental model is quite similar. It's simultaneously encouraging and discouraging to see your ideas reflected back to you from 30 years in the past :)
Looks complicated to reason about, but really interesting. Freeforth implements a similar mechanism to eliminate swaps via register renaming, obv. the focus is different in that Freeforth tries to still be an optimised stack, rather than a VLIW processor. I guess you could look at this paper as a vector processor, similar to the way APLs handle things with function composition. In that, you'd need the compiler to take care of things as managing it manually would be a nightmare. I guess the best way forward would be a specialized (forth) word lib which invokes the compiler to manage the top N stack cells for the duration of the word.
what is cvbp? google yields nothing.
I think that was meant to be CBPV (Call-By-Push-Value)
It's a typo for CBPV, Call-By-Push-Value.
What I like about this is how neatly it reduces everything to a small set of concepts. Much like lisp and forth, but not the same as either. It's exciting to think that there may be more of these out there, waiting to be discovered!
I been wanting to make a forth+lisp language for a while now - inspired by HP's RPL (its so close to being a lisp!).
Super cool
A friend mentioned RPL, but that didn’t seem to even have first-class functions… unless I’m mistaken?
Im not a CS guy, so I dont know what "first class" means.
I can do whatever to a function. I can use it as an argument. I can return it from another function. I can make a function that modifies functions.
All that w/out doing anything out of the ordinary day to day programing.
You can put functions on the stack.
with arrow syntax even
<< a -> << a a * >> >> 'SQUARE' STO
I still can't believe how advance RPL calculators wereThe HP engineers were good.
First HP engineers made RPL, then they attended Sussman and Abelson's intro to CS class. I think Sussman once said that, if RPL had anonymous functions it'd be a perfect lisp.
They were real good.
https://ocw.mit.edu/courses/6-001-structure-and-interpretati...
Reading the example, I really wish the syntax for push and pop had been "<foo" (push foo on stack) and ">bar" (pop from stack into bar). I find the choice of $ and ^ not obvious. Especially since, for me, ^ implies popping, not pushing.
You can fork and fix that with sed. Haha.
I decided on using sigils and then my thought-process drifted to Perl/Php, so $ naturally felt like a variable definition (which naturally must pop from the stack). I guess ^ is “push ‘up’ to stack”. But it’s all really arbitrary irrelevant syntax. YMMV.
'$' is common for interpolation in many languages. It's a good choice.
Awesome language, and it's definitely important from a theoretical point of view. However, you've missed the chance to call it "LiFo" :)
Interesting. I'm not sure about $ and ^ either but I think I'd keep getting confused about which one of < and > meant pushing vs pulling
Maybe + for push and - for pop?
This looks really neat, although I'm still wrapping my head around it! It's funny how things that are even more fundamentally simple than what has been discovered, can be (initially) harder to reason about!
So you could presumably also write a Lisp interpreter AND a Forth interpreter using it?
(Might as well write a Turing machine interpreter too for the trifecta... assuming one can decide on the syntax...)
That plus some additional functionality to make it more usable in the general sense (adding math functions, string manipulation, maybe some basic I/O) and it might be a VERY interesting instructional tool.
I've heard the term "thunk" but I forget what it means...
A thunk is a piece of glue machine code that adjusts some pointer (which is imagined to make a "thunk" sound) and then calls the real code.
It's a delayed evaluation, as I understand it?
That's the way I tend to read it. The fun bit then is applying context at eval time.
Rebol allows you to use a block as a list/function/thunk, which is lazily evaluated. This changes the scoping in that you BIND a context to this thunk.
It's a different take on process migration in that the context can be serialised with the thunk, or a new context applied at the process site. This is in contrast to lexically scoped lambda functions which then need to account for dynamic (free) variables somehow.
There is a usage of thunk in the context of implementing a compilation strategy for call-by-name arguments.
The usage survives into modern times, whereby lambda arguments are sometimes called thunks.
A thunk is just a function with no arguments representing a computation. In C++:
int a=1, b=2; auto thunk = [=]() { return a+b; }
so a deferred evaluation, like (in elixir) wrapping a computation in an fn (lambda of 0 arity)
What has Lisp semantics and a stack with "call by push value" and whatnot is any one of the stack-based virtual machines used for compiling Lisp over the past 60 years.
Another recent attempt at a language combining features of Forth and Lisp https://github.com/rickardnorlander/misc/tree/main/cursed_la...
Someone even asked the same question
' seems redundant. 'x could just be (x)?
though for that language the answer was that yes it's exactly the same.Cool language. One modest suggestion: perhaps return 0 rather than 1. Feature request: addition and division as well as subtraction and multiplication ;)
Addition can be implemented as:
(0 swap - -) $+
Division is much harder, but doable also. Generally Forsp in the camp of minimalist languages so minimizing the number of primitives seemed more interesting.
If you want to play around yourself, it’s trivial to add those operations.
> Division is much harder, but doable also.
> If you want to play around yourself, it’s trivial to add those operations.
Now you're using "trivial" like a mathematics professor!
Ha. Well I meant “add those operations to the C interpreter”, which would be like 4 lines of code.
Reminds me of the language from this paper, which also shows how to smoothly integrate I/O and state into such a style https://arxiv.org/abs/2212.08177
Forth+Lisp = Wet dreams of every programming nerd
Until you have to deliver something and you realize you love programming languages more than creating software.
nothing wrong with enjoying playing with your guitar more than actually making music.
Nothing wrong with it until it's time to make music and you need a normal guitar.
If you have time, could you explain the ycombinator and if parts a bit more? Been a while since I looked at this stuff
I like this very much!
pop is kinda like ! in FORTH and push is like @ so I think $foo would be more FORTHy as !foo and ^foo as @foo
If you like to work the other way I present https://github.com/garlic0x1/forth forth in common lisp.
Wow he even provides an interpreter in C. How would a compiler implementation for this language foreseeably differ from the interpreter given, does anyone know? Trying to learn about this more.
I'd take a leaf out of Freeforth's compiler (https://github.com/dan4thewin/FreeForth2): It implements a repl via single-pass (as with all forths) compilation of an anonymous block. Scoping in forth is dynamic ie. what's on the stack, and lexical in the dictionary ie. new words can shadow old ones.
Stacks have great GC properties vs. an actual GC, but for more complex scenarios, and dictionary management, some sort of GC would be preferred. I've not looked at the forsp source yet, but it may need a more general (realtime) GC, possibly via a linear-types system.
I feel like simple reference counting + linear types would be a really good approach in this case.
Could ^a be dispensed with? It looks like (a) is equivalent to it.
Subtly different. Originally I didn’t have ^a, but you do in-fact need it. ^a means “push with no evaluation” and (a) means “build a closure” that computes a”.
If you push, pop, push, pop, push repeatedly you’ll be wrapping the value in lots of closures. The value is no longer a value too. You’d have to “force” it to get back the value.
Semantically quite different. If you’re going all Turing tarpit, could you survive without “^a”? Maybe. But you’d be suffering for sure.
Operationally, you don't have to do the wrapping: since (a) is equivalent to ^a, you can substitute the implementation of the latter when you see the former.
Though if `(a)` is observably "no longer a value" of course then that's an issue. Which may or may not be a problem :-)
I thought of that. But then you’re making the semantics complicated. Say you really did want a (a) closure. Now you can’t have one, and you have to resort to more tricky wrapping. Not worth it.
I think the equivalent would be ('a) instead as (a) would resolve to "'result of application of a' push".
The page says that "`(foo bar) force` is the same computation as `foo bar`" and that "`^a force` is the same computation as `a`". So then `(a) force` should be the same computation as `^a force`.
Since it is written in C, we need to write COBOL and FORTRAN interpreters with it, make sure the manpages are in Latin, and we'll hit the ancient languages nerdvonia.
User viewable docs should be Anglo-Saxon and dev docs should be Latin (or Attic Greek?) Publicity materials should be medieval Arabic, and the secret documentation that isn't shared should be in Mandarin.
I really like the syntax and ergonomics - nice job @xorvoid.
Very nice! Thanks a lot for the discovery.
I've never really got Forth. It's good for sending commands to a printer, but actually programming in it seems like a drag. I get that it's technically the same as lisp code, just backwards without the parens, but adding the parens just makes it easier to understand. Adding variables is smart and would make it more tolerable but I would still rather work in lisp.
I guess you could say that syntactically it's like lisp, only backwards and without the parens. But the deeper truth is that concatenative programming and procedural programming (or functional, I suppose, depending on your preferred lisp flavor) are very different paradigms that lead to fundamental differences in how you structure your code.
I haven't used Forth in anger for over two decades now, but in some ways I do miss how the language felt even more hackable than a good lisp, while simultaneously being easy to reason about at a low level. That said, the RPN and stack manipulation definitely make it a bit of a brain teaser, especially at first. In that sense it reminds me of Prolog and APL and, for that matter, Lisp: they're languages where the greatest joy of learning them is the way they force you to think in very different ways, and how that can promote epiphanies about computation.
I would need to spend more time grokking this language to really understand, but it appears that the author has effectively achieved something that's semantically equivalent to Forth, but has a lot of syntactic commonality with Lisp. That's rather compelling to me from a "eliminates the first and most noticeable stumbling block for newcomers" perspective. That said, I'm having to repress a bit of a desire to react the same way parts of the Racket community did when Matthew Flatt announced his infix syntax project. Is a key part of Forth's concatenative nature lost when the syntax itself is no longer concatenative? I don't know, and I'll admit it's very doubtful that I'll spend enough time with Forsp to find out. But it's an interesting thought, all the same.
Here's why it's interesting: https://termbin.com/blz3
There you go, now it has parens, just backwards. Can even use parens instead of braces if you want, I just didn't want to overwrite comment syntax.
I copy your code below, because it's so terse and marvellous and I don't want to deprive anyone who assumes otherwise and doesn't click on your link:
: >BRACES 1 CELLS BRACES +! BRACES @ ! ;
: BRACES> BRACES @ @ -1 CELLS BRACES +! ;
: { DEPTH >BRACES ;
: } DEPTH BRACES> 1+ <> ABORT"unbalanced" ;
And the example "backwards lisp" expressions you give as test cases: { 1 2 + } .
{ 5 { 1 2 + } * }
I don't find that particularly interesting. My entire point was that it has essentially the same semantics as lisp, and this just demonstrates that. Also, I don't know a lot about Forth, but it would seem to be limited to 30 levels of brace nesting, even between functions. Unless I am mistaken, it requires a pointless increment and decrement of a variable before and after every function call. I also don't know if it would work in a threaded context. Aren't those just functions? If so, it wouldn't work in compiled code.
> it would seem to be limited to 30 levels of brace nesting
Yes because I declared 30 cells, you can declare more if you want.
> Unless I am mistaken, it requires a pointless increment and decrement of a variable before and after every function call.
Yes it adds overhead, you can rebuild with : { ; IMMEDIATE : } IMMEDIATE ; and then there's zero overhead on any Forth, when you've demonstrated the braces match. This is meant as an aid for people that struggle without parens.
> Aren't those just functions? If so, it wouldn't work in compiled code.
It works in compiled and interpreted code, 'just functions' compile automatically without trouble in Forth, it's parsing words and immediate words that need care.
No that code wouldn't work with multiple threads, but if you declare the space as USER space it would, similar to just whacking 'thread_local' on something.
The example is failing in one regard, that it doesn't provide a way to reset the stack if something goes wrong. You can use something like : RESET{} BRACES BRACES ! ; to accomplish that. But it is just meant to show that Forth is quite succinct and powerful, which it does.
> Does the call to DEPTH still work in that case?
Code using the braces can use those definitions when you want to remove the overhead, and it disables the feature. This lets you use it as a programming aid, and then remove the overhead completely when it's not necessary or for a 'release build'.
> It's simultaneously more readable and more terse than the Forth code.
Does it still look neater when you use the syntax you've defined?
> And this example doesn't stop working after a certain number of open braces.
If you really want to remove the limit you can, Forth is just explicit about memory management. This is like C vs JavaScript. I'm quite happy playing with memory management myself, but if you don't like that or think declaring a fixed size stack is hacky then I can see why Forth wouldn't be appealing.
Here is one way to remove the limit, it has drawbacks (but so do all dynamic memory approaches, they're just hidden by managed languages).
: POP HERE -1 CELLS + @ -1 CELLS ALLOT ;
: { DEPTH , ;
: } DEPTH POP 1+ <> ABORT" Unbalanced" ;
> Saying that Forth is powerful just because you can use it to write a compiler for a more powerful language is silly.Yes but in a few lines of simple code, and the result is actually a syntax that looks nice to use.
Forths do have local variables. (Cue debates on what makes a stack-based Forth-like language a "Forth".) Here is the description of Gforth's support for its own extended locals and the Standard Forth locals: https://gforth.org/manual/Locals.html (Cue debates on whether Standard Forth is a Forth and whether it is standard.)
When I learned Forth about 15 years ago, we used the { } syntax described here, not the newer(?) {: :} syntax: https://gforth.org/manual/Local-Variables-Tutorial.html
Forth is awesome in situations where you've got a small environment, with crap tools, and you need to do some iterative development. In a pinch, you could hand assemble a Forth kernel and just burn it on an EEPROM and away you go. There are usually nicer alternatives today, but when you had to build your tools up from nothing, it was a very reasonable option.
Correct. You can build a Forth compiler from hand-typed hex machine code. It will be a bit easier if you have an assembler handy, but that's not essential. Trying to build a Lisp compiler this way would be madness.
The comparison between Lisp and Forth that is often made is very shallow. It's more than just prefix vs postfix. Lisp takes garbage collection for granted, Forth is designed to avoid any complication and doesn't even have heap allocation.
The lack of local variables and named parameter is also a consequence of this search for ultra-low complexity, in dialects that follow the traditional style (the one of the creator of Forth, Chuck Moore).
The readability argument is one that should always be taken 1024 grains of salt. You say Lisp is more readable, yet you'll find some people to say that it's "Lots of Insipid and Stupid Parenthesis" and that prefix notation is unnatural. You could also find positive opinions about APL. It is just a matter of training and habits.
Lisp is objectively more readable. Adding parentheses around expressions distinguishes them. Compare the following.
a b c d
(d (c b) a)
The bottom example is objectively more readable. You can determine more about its intended function by looking at the code. In the top case, you can't tell what is a function call or how many arguments each function consumes. In the bottom, you can.> objectively more readable
Certainly not "objectively" in the literal sense, because there's no measure of "readability". But you can define one if you want, so that we can argue about its biases and relevance ;-)
But Forth programmers sometimes use double spaces to separate sub-expressions:
(d (c b) a) <=> a b c d
... or something like that. But in reality what one writes is (hopefully this is close enough to your abstract example): (square-root (+ (square a) (square b)))
<=>
a square b square + square-root
Or if we stick to real-forthers-do-not-use-local variables: square swap square + square-root
Newbies will probably use "training wheels comments": ( a ) square swap ( b ) square + square-root
I know the usual objections that one needs to keep the number of inputs/outputs of each word/function. It's actually not as difficult as it seems with good naming and when you keep things simple. Actually, the former is a well-known hard problem and the latter is a perhaps lesser-known hard problem. But I would say that's good problems to solve.> Something is harder to read if it takes more effort to read it.
I think the smiley was a sufficient warning not to go there. Define harder. For whom ? etc.
> hold that information on a stack in your head.
I have proactively addressed that point, it is a matter of coding style and "good practices". But since you can only believe me on that, the discussion can only go in circles if you don't agree.
> It's just transparently obvious that reading Forth code takes more effort than reading lisp code for anyone who has spent equal time doing both.
It depends by whom this code was written. I can certainly craft Lisp code which is a nightmare to read. I've seen many times Forth code that was not designed and written in the best style. "Readability" is a relationship between the reader and the writer, it's something that becomes clear when you do code reviews. I think people would generally agree that C++ is "more readable" than both Forth and Lisp, yet if you use lesser-known C++ features - or just omit parens around expressions because you assume that the priorities are obvious - you can easily lose intermediate level reviewers.
Anyway, after reading many online threads about "readability", I came to the conclusion that it is a non-argument, and that discussions about it are most often pointless. But I wanted to use that to show how Forth solves this apparently huge problem, and introduce a bit of its philosophy.
Yes, Forth is point-free at its core, the only named variables that exist are global variables. The standard defines local variables as an optional extension.
The coolest thing about Forth to me wasn't Forth itself but that it was a part of the OpenFirmware standard: https://standards.ieee.org/ieee/1275/1932/ I never understood why that standard just died and wasn't adopted for something like ARM SBCs where the built in debugging ability would come in useful.
Me too. For some reason, Lisp is kinda easier for me. And if I need to go low level: assembly.
I read Forth was popular among embedded devs during 80/90s. That's not my field, though.
Forth was the sweetest thing in fixed length machines 60 years ago. When everything is one word, you can assign unique symbol for that word and keep those symbols in separate file (aka paper tape). Some Forth primitives were also machine primitives, like INC [SP] or "++", increase the number at top of stack.
There are Forth like languages, such as Kitten, that have variables and more.
Do they have any compelling reason why "List<T>" isn't just "T List"? That's how they write it in OCaml. Then comma separated lists, are they mad? I much prefer how Uiua does this.
Hmm, really nice! I might have to steal all of this for my language (https://github.com/alexisread/minim/blob/develop/minim/minim... really in flux ATM)
Reminds me of https://pygmy.utoh.org/3ins4th.html except the third instruction is execute rather than quote.
In the spirit of building from scratch, I'd like to highlight sectorforth/milliforth (https://github.com/fuzzballcat/milliForth/blob/master/sector...) - they implement as little as possible to get a working system ie. just the basic fetch/store ops, numerical and I/O.
The parser can probably be reduced - there's a nice paper detailing Cognition (a forth dialect, but the parsing could be broken out into a lib - https://ret2pop.nullring.xyz/blog/cognition.html) where using a couple of crank operators, you can implement custom syntax in a forth-style way (as opposed to say PEGs).
In terms of variables and scoping, Dreams (another forth dialect- http://elilabs.com/~rj/dreams/dreams-rep.html) uses mianly dynamic scoping, and builds it's structs (C-style structs) using the standard forth struct words. This allows it to be hard-realtime though as you don't need to walk the lexical tree to bind variables. You can also early bind like with CBPV. I guess this is also similar to REBOL's BINDology (https://github.com/r3n/rebol-wiki/wiki/Bindology)
Lots of overlap here with these things. Just for completeness there's also dynamic variable handling (over lexically scoped structures) with wat (https://github.com/GiacomoCau/wat-js/tree/master) though this is not realtime.
Is it possible to make the closures dynamically scoped by default here? I've not had time to think about this properly. I like the fact that eval is lazy here like in forth or REBOL, rather than eager as in lisp - the idea of passing around blocks/thunks appeals wrt realtime (well with dynamic scoping) and parallelism.
The env per-thread I guess could be compared to the forth dictionary? Dreams abstracts this by virtue of it's (token) threading model which appears useful for task switching, objects and the like.
I'd also like to highlight able forth which has: Compiler-only design (like Freeforth) - Word-classes (like Retro) - A straightforward bootstrap process using a minimal set of seed words (like seedForth) - No (ZERO) special forms, not even integers, and a consistent model for adding literals without the complexity of i.e. recognises - A single flat word-list that can be freely manipulated and composed of other word-lists - No separate [assembly] code words - Explicit optimizations (ONLY) i.e. explicit tail-call elimination
I've only started looking into able forth so can't really comment on it much, but the no-special-forms appeals here.
Sorry, random thoughts here, I'd like to discuss further when I've had time to digest your language. :)
Happy to discuss. Email address can be found on my website.
> It's a hybrid language combining Forth and Lisp, so naturally it's called Forsp
Shame. Missed a golden opportunity to call it "Lithp" :)
Someone could still call the libre implementation of Forsp Lithp so not all is lost.
Nice, anyone using this? Have practical feedback?
Does this have potential to grow, or would most people just say 'use lisp'.
I only ask because there are hundreds of cool niche languages now. And, even thought this looks cool, time is time, and I'm just wondering if is worth committing any effort to learning/using/playing with.
Is there any feeling that this has legs?
Thanks for the kind words. But if you’re looking for fancy data-structures, you won’t find them. Haha. It’s just a single object type (sum-type/tagged-union) and lots of cons cells. In other words, just a minimalist Lisp.
I am in love with 'fe', as well as the slightly-better IMHO 'aria' by the same author.