notes

20 Years of Whitespace: Past and Future

by Thalia Archibald, 1 April 2023

Today marks the 20th anniversary of the public release of the Whitespace programming language. What started as a gimmick, to encode a language with invisible syntax, has inspired over 350 projects and become a recurring curiosity in the esoteric programming language community.

What is Whitespace?

Whitespace is a small, stack-oriented programming language, with a completely invisible syntax.

Syntax

The only tokens are space, tab, and line feed characters—anything else is a comment and ignored. Instructions are represented as a sequence of those tokens, so parsing follows a prefix tree:

Prefix tree of instructions

It is not an optimal encoding like a Huffman coding, but it does assign shorter opcodes to frequent instructions like push, while leaving room for potential extension.

Number literals start with the sign (space for positive or tab for negative), followed by the digits in binary as a sequence of spaces (representing 0) and tabs (representing 1), terminated with a line feed. Labels are encoded similarly, but without a sign.

Instructions

Whitespace has only 24 instructions.

More details can be found in the official language tutorial.

Semantics

TODO

Example

For example, this program prints the numbers 1 through 10:

   	

   	    		
 
 	
 	   	 	 
	
     	
	    
    	 		
	  	
	  	   	 	

 
 	    		

   	   	 	
 




It’s not very readable, so an assembly-like syntax is often used:

    push 1   # SSSTL
C:           # LSSSTSSSSTTL
    dup      # SLS
    printi   # TLST
    push 10  # SSSTSTSL
    printc   # TLSS
    push 1   # SSSTL
    add      # TSSS
    dup      # SLS
    push 11  # SSSTSTTL
    sub      # TSST
    jz E     # LTSSTSSSTSTL
    jmp C    # LSLSTSSSSTTL
E:           # LSSSTSSSTSTL
    drop     # SLL
    end      # LLL

History

TODO

About me

I’ve done a lot with Whitespace. When I started, it was a simple procrastination project during finals week to build an interpreter in C++, but it quickly grew in ambition. I learned LLVM, so I could build a Whitespace compiler in Go targeting LLVM IR. Then I built a Whitespace interactive debugger, to learn jq in depth. It was the first project I built in Rust. I tried making a Whitespace interpreter in Brainfuck (spoiler alert: it’s difficult). I formalized it in Coq with small-step semantics. Most recently, over the weekend at a hackathon, I built an optimizing compiler middle-end in 24 hours, that models Whitespace’s lazy semantics, eliminates the stack within basic blocks, and performs common-subexpression elimination.

After learning enough assembly to feel powerful, during finals week, I decided to build a Whitespace interpreter, since it then felt approachable. That procrastination project led to numerous interpreters and compilers for Whitespace, including a Coq formalization, a Go AOT optimizing compiler, an interpreter in Rust that models the intentional lazy semantics from Haskell, a debugger in jq, and others.

It activated a latent interest in PL theory. I credit it to solidifying my interest in PL theory and shifting my studies to that.

I’m applying for grad schools soon. Are there programs in Germany, the US, Switzerland, Austria, or the Netherlands, that do research in programming languages? Be more specific