Fitting a Forth in 512 bytes (2021)

(compilercrim.es)

91 points | by whereistimbo 5 days ago

3 comments

Jun8 4 days ago
Forth has been sitting on my list of cool things to learn when I have time for the past 20 years or so. What would be a compelling use case and setup?
- mananaysiempre 4 days ago
  One compelling use case is getting a REPL on any microcontroller with a UART and, say, ≥64K RAM (I remember lbForth[1] being particularly portable, and there are other implementations you could use as well).
  I don’t know if you’d want to have that be your first experience with Forth itself, though: there’s inherent fiddliness involved in bringing up hardware; the win is that Forth doesn’t really add any of its own once you’re vaguely familiar with the internals. If you can get it to boot and send and receive bytes, you can get an interactive Forth on it, or, if the available resources don’t permit that, on an imaginary machine spanning it and your PC (a “tethered” Forth).
  [1] https://github.com/larsbrinkhoff/lbForth
  NonEUCitizen 4 days ago
  I think you meant ≤ instead of ≥ 64K RAM
  mananaysiempre 4 days ago
  I didn’t, no, but perhaps this merits a clarification.
  The native habitat of a traditional Forth is an 8-bit microcomputer, and compared to modern microcontrollers those had not a lot of compute but fairly abundant RAM (not to mention permanent rewritable storage). So to run a Forth organized along the usual lines, with a flat address space and a singular dictionary space and code loaded from textual blocks and so on, you do want 64K of RAM, I think. You could push that down to maybe 16K, with some limitations[1].
  But on a modern 8-bitter with 2K or 4K, you are going to need a system that can compile things offline, and then you can flash the resulting image and do your interactive work within a small in-RAM layer on top of that. That’s absolutely a thing people do, but it’s not what Starting Forth, Thinking Forth, and various other sources about Forth describe.
  [1] E.g. the Apple II fig-Forth from https://www.forth.org/fig-forth/contents.html requires the 48K RAM expansion and “provid[es] about 6K for user growth of the dictionary”.
- jlokier 4 days ago
  I looked into tiny Forth and Lisp interpreters as the foundation for bootstrapping bigger things.
  I expected Forth to be better for this because it seems like a smaller language, minimal even, that would have the smallest interpreter.
  Forth feels lower level and simpler. After all, on the face of it, Forth is mostly a flat sequence of tokens and simple commands, run one after another like assembly language.
  Whereas Lisp has obvious tree data structures up front in a prettier syntax, like higher level languages, and you're encouraged to use them. Even lexical scope, closures and macros if you want them. That seemed like it must be bigger and heavier.
  But I found the small Forth and Lisp interpreters came out about the same size.
  This is backed up by the existence of 512-byte boot sector implementations of both Forth and Lisp.
  So I decided for applications needing a tiny interpreter, or for bootstrapping, to not pursue Forth any more for those things, as a tiny Lisp does the job with (in my opinion) better ergonomics and versatility, and negligible cost difference.
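To make the "flat sequence of tokens, run one after another" picture concrete, here is a minimal sketch of a Forth-like evaluator. It is an illustration only and does not correspond to any implementation mentioned in this thread: the function name `run`, the handful of primitives, and the text-substitution handling of colon definitions are all assumptions chosen for brevity, and a real Forth has a dictionary, a return stack, and compiled threaded code.

```python
# Hypothetical toy evaluator: whitespace-separated tokens executed left to right.
def run(source, stack=None, words=None):
    stack = [] if stack is None else stack
    words = {} if words is None else words  # user colon definitions, stored as text

    # A few primitives operating on the data stack (names follow common Forth usage).
    prims = {
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
        "dup": lambda s: s.append(s[-1]),
        "drop": lambda s: s.pop(),
        "swap": lambda s: s.extend([s.pop(), s.pop()]),
    }

    tokens = source.split()
    i = 0
    while i < len(tokens):
        tok = tokens[i].lower()
        if tok == ":":
            # ": name body ;" records the body; no nesting or error handling here.
            end = tokens.index(";", i)
            words[tokens[i + 1].lower()] = " ".join(tokens[i + 2:end])
            i = end
        elif tok in words:
            run(words[tok], stack, words)  # execute a user-defined word
        elif tok in prims:
            prims[tok](stack)
        else:
            stack.append(int(tok))  # anything else is assumed to be an integer
        i += 1
    return stack
```

For example, `run(": square dup * ; 7 square")` leaves `[49]` on the stack. The whole interpreter is one loop over tokens, which is the property that makes small Forths look attractive for bootstrapping.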
  veltas 2 days ago
  I'm a big Forth fan and I'd say lisp interpreters consistently come out smaller and better than their equivalent tiny Forths.
  I don't think it's a reason not to be interested in Forth, unless that was the draw.
- zabzonk 4 days ago
  Some of the things I learned from implementing a FORTH on a CP/M box back in the early 1980s were
  - writing Z80 assembler
  - using the assembler and linker
  - getting a clue about how the CP/M file system worked (it didn't, very well)
  - writing a number of utilities (VT52 emulator, PacMan clone & stuff) in FORTH, which was fun
  - macro programming at compile/runtime using things like BUILDS/DOES
  It's a lot of fun and you don't need to invest much time in it to get things done.
- simne 4 days ago
  Forth is known for an extremely compact representation of code and for being very portable, because the standard assumes very few registers and works on a stack machine. Unfortunately, all this comes at the cost of slow execution, and it is not easy to build serious projects (it is hard to deal with the chaos of a large codebase, but read on).
  So it is definitely a platform for slow embedded applications: think Arduino, maybe Raspberry.
  Sure, compactness is a huge advantage in some other cases. For example, a known boot loader shell for UEFI is written in Forth, so you could write applications for it (and yes, for this use case, on any modern motherboard except Apple).
  For large codebases, some Forth people learn Scheme and effectively switch to the Lisp programming paradigm. Yes, it's an interesting phenomenon: Lisp techniques work well enough with Forth if the developer is disciplined enough. Sure, Lisp will open a whole new world for you.
- astrobe_ 4 days ago
  Application scripting. Single file, low footprint, straightforward script/application interface. Can help with debugging.
  eternityforest 3 days ago
  Lots of languages are very easy to embed, why not something more common and popular? Seems generally worth the effort to pick an easier to use language.
  Unless you know ahead of time your target audience probably likes it, or you're on a tiny embedded system.
  irq-1 4 days ago
  Is there a Forth made for scripting/embedding in an application?
  kragen 3 days ago
  ATLAST https://www.fourmilab.ch/atlast/atlast.html is one, and PFE and GForth can also be used in that mode.
  stevekemp 3 days ago
  A toy golang one you can embed, and a tutorial for how it was written here:
  https://github.com/skx/foth
  You can see it embedded to provide a turtle-like graphics thing in this repository:
  https://github.com/skx/turtle
  astrobe_ 4 days ago
  According to ChatGPT, there's GForth, SwiftForth, Forth-83, eForth and picoForth, but I cannot really tell how true it is. I use my own implementation, which I wish to publish someday but that's extra work for me to "anonymize" it. Probably the closest to it among ChatGPT's suggestions is eForth, but it's unfortunate IMHO they use C++.
  kragen 3 days ago
  ChatGPT is, as usual, full of shit here.
  Forth-83 is not an implementation but a standard, superseded by the ANSI standard in 01994, though sometimes people confuse it with F83, which is Laxen and Perry's public-domain implementation of the Forth-83 standard. It came out in 01984. It runs on CP/M and MS-DOS, but the MS-DOS version is limited to a single 64-kibibyte segment. It has no facilities to support embedding it in a program written in another language.
  eForth, if by that you mean Bill Muench's eForth, is written in, mostly, Forth, on top of a small assembly-language core. It also has no facilities to support embedding it in a program written in another language.
  GForth is pretty usable and can be embedded, as documented in https://gforth.org/manual/Integrating-Gforth.html (which is down at the moment but will probably be back up soon along with forth-standard.org.)
  I don't know anything about SwiftForth and picoForth, but so far ChatGPT is 1 for 3, so I wouldn't bet much on those either.
  astrobe_ 2 days ago
  The eForth I commented on indeed derives from Muench and Dr Ting [1].
  About GForth, it doesn't look convincing - actually the feature looks like an afterthought (like those libraries that have been torn out of a program, e.g. libCurl).
  This use-case isn't well covered for Forth it seems, because the focus has always been mostly on the other embedding. That's why I'd recommend again the DIY route.
  [1] https://github.com/chochain/eforth
- kragen 3 days ago
  I found that it was pretty fun for writing a minimal roguelike in: http://canonical.org/~kragen/sw/dev3/wmaze.fs (asciicast at https://asciinema.org/a/672405)
  In terms of compelling, though, I think it's mostly interesting as a study of how small and simple you can make a computing system—and what you give up when you do.
  Forth lets you get an eminently hackable REPL up and running in a couple of thousand lines of code and a few thousand bytes of memory.
- eternityforest 3 days ago
  To me it seems like one of the least useful languages out there, but a lot of people love it.
  The one use case where it seems good is I think people were saying you could use it to write a compiler that you could bootstrap some other language in, because it's so small and presumably easy to audit at the assembly language level to avoid Trusting Trust problems.
  Everything else seems rather like suckless: it appeals to people who love simplicity for its own sake and want maximum control and understanding of the system, not so much to those who want to support every file format ever made, handle every possible user error, hardware failure and last-minute use case change, and reuse as much well-known black-box code as possible.
IAmLiterallyAB 4 days ago
I have a side project to try to make an extremely minimal compiler for a Forth-inspired language and implementation. It's a compiler that generates a Forth-like interpreter and bytecode. The bytecode is Huffman-encoded. Nothing works yet, but it's an idea.
- 082349872349872 4 days ago
  Have you seen ColorForth? Moore Huffman-encoded (or similar) his character set, so he could do dictionary lookup with just REPNZ SCAS.
- kragen 3 days ago
  Are you planning to decompress the bytecode before execution? You might be interested in file `ingram-token-threading.md` in http://canonical.org/~kragen/sw/pavnotes2.git/, which describes a possible way to efficiently interpret Huffman-encoded bytecode on CPUs with wide registers.
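As a rough illustration of the "decompress before execution" option, the sketch below builds a Huffman code over the opcodes of a program, packs the program into a bit string, and decodes it back one bit at a time. It is not the design of the project described above and not the token-threading scheme from the linked notes; the function names (`build_code`, `encode`, `decode`) and the use of Python strings of `"0"`/`"1"` as a stand-in for packed bits are assumptions made for readability.

```python
import heapq
from collections import Counter


def build_code(program):
    """Return {opcode: bitstring} for a Huffman code (assumes a non-empty program)."""
    freq = Counter(program)
    # Heap entries are (weight, unique tiebreak, {opcode: partial code}).
    heap = [(w, i, {op: ""}) for i, (op, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single distinct opcode still needs one bit per occurrence.
        return {op: "0" for op in heap[0][2]}
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code beneath them.
        merged = {op: "0" + code for op, code in left.items()}
        merged.update({op: "1" + code for op, code in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encode(program, code):
    """Concatenate the code of each opcode into one bit string."""
    return "".join(code[op] for op in program)


def decode(bits, code):
    """Decode by growing a prefix until it matches a codeword (codes are prefix-free)."""
    inverse = {bitstring: op for op, bitstring in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out
```

With a program such as `["dup", "*", "dup", "*", "dup", "+", "swap", "drop", "dup", "dup"]`, the frequent `dup` gets the shortest code and rarer opcodes get longer ones. Decoding bit by bit like this is simple but slow, which is the cost the question about decompressing ahead of time is getting at.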
dang 4 days ago
Discussed at the time:
Fitting a FORTH in 512 bytes - https://news.ycombinator.com/item?id=27468698 - June 2021 (57 comments)