Reading list by date
Papers I have read and conference talks I have watched for research and
projects.
2025
2024
- 2024-12-31:
“Historical Oddities & Persistent Itches”,
James Gosling, JVM Language Summit 2024
- 2024-12-28:
“Machine-Assisted Proof”,
Terence Tao, 2024
- 2024-12-27:
Nix Pills,
Luca Bruno, 2014–2015
- 2024-12-16:
“HOIST: A System for Automatically Deriving Static Analyzers for Embedded Systems”,
John Regehr and Alastair Reid, 2004
- 2024-12-13:
“Word-initial rhotic avoidance: a typological survey”,
Laurence Labrune, 2021
- 2024-12-09:
“wevaling the wasms: AOT JS Compilation (Or: Stuffing a Dynamic Language onto
a Very Static Platform)”,
Chris Fallin, 2024
- 2024-10-30:
“Reduction of OBDDs in linear time”,
Detlef Sieling and Ingo Wegener, 1993
- 2024-10-29:
“Graph-Based Algorithms for Boolean Function Manipulation”,
Randal E. Bryant, 1986
- 2024-10-24:
“Relational E-matching”,
Yihong Zhang, Yisu Remy Wang, Max Willsey, and Zachary Tatlock, POPL 2022
[preprint]
- 2024-10-24:
“Z3: An Efficient SMT Solver”,
Leonardo de Moura and Nikolaj Bjørner, TACAS 2008
- 2024-10-24:
“Efficient E-matching for SMT Solvers”,
Leonardo de Moura and Nikolaj Bjørner, CADE 2007
- 2024-10-23:
“Designing a Fast, Efficient, Cache-friendly Hash Table, Step by Step”,
Matt Kulukundis, CppCon 2017
- 2024-10-17:
“Better Together: Unifying Datalog and Equality Saturation”,
Yihong Zhang, Remy Wang, Oliver Flatt, David Cao, Philip Zucker, Eli
Rosenthal, Zachary Tatlock, and Max Willsey, PLDI 2023
- 2024-10-15:
“egg: Fast and Extensible Equality Saturation”,
Max Willsey, Chandrakana Nandi, Yisu Remy Wang, Oliver Flatt, Zachary
Tatlock, and Pavel Panchekha, POPL 2021
[arXiv]
- 2024-10-12:
“Egraphs and Automated Reasoning: Looking Back to Look Forward”,
Philip Zucker, PLDI 2024
- 2024-10-12:
“EGSTRA: E-Graph-Based Structures for Test Suite Reduction and Abstraction”,
Sabrina Reis, Matthew Sottile, PLDI 2024
- 2024-10-12:
“Slotted E-Graphs”,
Rudi Schneider, Thomas Kœhler, Michel Steuwer, PLDI 2024
- 2024-09-24:
“Understanding Graal IR”,
Chris Seaton, VMIL 2020
- 2024-09-18:
“Flow Diagrams, Turing Machines And Languages With Only Two Formation Rules”,
Corrado Böhm and Giuseppe Jacopini, CACM 1966
- 2024-09-07:
“Binary Search a Little Simpler & More Generic”,
Jules Jacobs, 2020
[follow-up]
- 2024-08-07:
“How Can I Academia When My Brain Can’t Even? Mental Health in Grad School
and Beyond”,
Kenny Foner, PLMW 2019 [publications]
- 2024-07-26:
“cjq: A Compiler for the jq Programming Language”,
John Rubio, 2024, M.S. Thesis
- 2024-07-17:
“DRAFT: The UNIX Time-Sharing System”,
Dennis M. Ritchie, 1971
- 2024-07-14:
“Memory Requirements in a Telephone Exchange”,
Claude E. Shannon, 1950 [HN]
- 2024-07-05:
“A logical calculus of the ideas immanent in nervous activity”,
Warren S. McCulloch and Walter Pitts, Bulletin of Mathematical Biophysics,
1943
- 2024-07-05:
“REC language is a live on IBM1130 simulator”,
Ignacio Vega-Paez, Jose Angel Ortega, and Georgina G. Pulido, 2009
- 2024-07-05:
“LIDA/REC Visual Language for Databases interface PostgreSQL”,
A. Hernández-Montoya and S. V. Chapa-Vergara, 2005
- 2024-07-05:
“REC and Convert as aids in teaching Automata Theory”,
Gerardo Cisneros and Harold V. McIntosh, 1989
- 2024-07-04:
“A Synthesizing Superoptimizer”
(Raimondas Sasnauskas, Yang Chen, Peter Collingbourne, Jeroen Ketema, Gratian
Lup, Jubi Taneja, and John Regehr, 2018)
- 2024-07-03:
“Tree-sitter - a new parsing system for programming tools”
(Max Brunsfeld, 2018)
- 2024-06-16:
“Manifest V3 Unveiled: Navigating the New Era of Browser Extensions”
(Nikolaos Pantelaios, Alexandros Kapravelos, 2024)
- 2024-05-28:
“Surveilling the Masses with Wi-Fi-Based Positioning Systems”
(Erik Rye and Dave Levin, IEEE S&P 2024)
- 2024-05-20:
“GxHash: A High-Throughput, Non-Cryptographic Hashing Algorithm Leveraging
Modern CPU Capabilities”
(Olivier Giniaux, 2023)
- 2024-05-18:
“A Differential Meet-in-the-Middle Attack on the Zip cipher”
(Mike Stay, DEF CON 2020) [paper]
[code] [news]
- 2024-05-18:
“Incremental Computation with Adapton”
(Matthew Hammer, 2015)
- 2024-05-11:
“Light years ahead: How the Apollo Guidance Computer pioneered an era of
reliable software”
(Robert Willis, 2019)
- 2024-05-08:
RFC 9535 “JSONPath: Query Expressions for JSON”
(Stefan Gössner, Glyn Normington, and Carsten Bormann, 2024)
- 2024-05-03:
RFC 9485 “I-Regexp: An Interoperable Regular Expression Format”
(Carsten Bormann and Tim Bray, 2023)
- 2024-04-24:
“A Correspondence between Continuation Passing Style and Static Single
Assignment Form”
(Richard A. Kelsey, 1995)
- 2024-04-23:
“Having it both ways: Larry Wall, Perl and the technology and culture of the
early web”
(Michael Stevenson, 2018)
- 2024-04-22:
The UNIX-HATERS Handbook
(Simson Garfinkel, Daniel Weise, and Steven Strassmann, 1994)
[HN]
- 2024-04-21:
“A Regular Expression Matcher”
(Rob Pike and Brian Kernighan, 2007)
- 2024-04-12:
“On the Feasibility of Stealthily Introducing Vulnerabilities in Open-Source
Software via Hypocrite Commits”
(Qiushi Wu and Kangjie Lu, 2020)
- 2024-04-08:
“SSA is Functional Programming”
(Andrew W. Appel, 1998)
- 2024-04-08:
“The future vision of Ruby Parser”
(Yuichiro Kaneko, 2023)
- 2024-04-07:
“Regular Expression Search Algorithm”
(Ken Thompson, 1968)
- 2024-04-07:
“Tracing Back the History of Commits in Low-tech Reviewing Environments: A
case study of the Linux kernel”
(Yujuan Jiang, Bram Adams, Foutse Khomh, and Daniel M. Germán,
ESEM 2014)
- 2024-04-07:
“A Dataset of the Activity of the git Super-repository of Linux in 2012”
(Daniel M. Germán, Bram Adams, and Ahmed E. Hassan, MSR 2015)
- 2024-04-06:
“Why Programming Languages Matter”
(Andrew Black, 2023)
- 2024-04-06:
“Translation Validation for a Verified OS Kernel”
(Thomas Sewell, Magnus Myreen, and Gerwin Klein, PLDI 2013)
- 2024-04-05:
“Magic: The Gathering is Turing Complete”
(Alex Churchill, Stella Biderman, and Austin Herrick, 2019)
- 2024-04-04:
“Impact of Economics on Compiler Optimization”
(Arch D. Robison, 2001), discussed in a thread
by John Regehr
- 2024-03-12:
“Formal specification of the jq language”
(Michael Färber, 2024)
- 2024-03-16:
“if … then … else had to be invented!”
(Erica Fischer, 2019)
[video]
- 2024-03-01:
“Colored E-Graph: Equality Reasoning with Conditions”
(Eytan Singher and Shachar Itzhaky, 2023)
- 2024-03-21:
“Guided Equality Saturation”
(Thomas Kœhler, Andrés Goens, Siddharth Bhat, Tobias Grosser, Phil Trinder,
and Michel Steuwer, POPL 2024)
- 2024-03-02:
“Merge-Tree: Visualizing the integration of commits into Linux”
(Evan Wilde and Daniel M. Germán, VISSOFT 2016)
2023
- 2023-11-16:
“Rete: A Fast Algorithm for the Many Pattern/Many Object Pattern Match Problem”
(Charles L. Forgy, 1982)
- 2023-10-31:
“LLVM: An Infrastructure for Multi-Stage Optimization”
(Chris Lattner, 2002)
- 2023-10-27:
“Secrets of the Glasgow Haskell Compiler inliner”
(Simon Peyton Jones and Simon Marlow, 2002)
- 2023-10-18:
“Partial Evaluation of Computation Process, Revisited”
(Yoshihiko Futamura, 1999)
- 2023-10-15:
“On the Expressiveness of Purely Functional I/O Systems”
(Paul Hudak and Raman Sundaresh, 1989)
- 2023-10-15:
“Arbitrary precision arithmetic using continued fractions”
(Simon Peyton Jones, 1984)
- 2023-10-09:
“The algorithm for precision medicine”
(Matt Might, ICFP 2023)
[HN]
- 2023-10-06:
“Parsing distfix operators”
(Simon Peyton Jones, 1986)
- 2023-09-24:
“Memory-safe Execution of C on a Java VM”
(Matthias Grimmer, Roland Schatz, Chris Seaton, Thomas Würthinger, and
Hanspeter Mössenböck, 2014)
- 2023-09-24:
“An Efficient Approach for Accessing C Data Structures from JavaScript”
(Matthias Grimmer, Thomas Würthinger, Andreas Wöß, and Hanspeter Mössenböck,
2014)
- 2023-09-23:
“Denotational Semantics and a Fast Interpreter for jq”
(Michael Färber, 2023)
- 2023-09-12:
“Control Flow Analysis in Scheme”
(Olin Shivers, 1988)
- 2023-08-29:
“ægraphs: Acyclic E-graphs for Efficient Optimization in a Production
Compiler”
(Chris Fallin, 2023)
- 2023-08-26:
“The Sea of Nodes and the HotSpot JIT”
(Cliff Click, 2020)
- 2023-07-25:
“A computability perspective on self-modifying programs”
(Guillaume Bonfante, Jean-Yves Marion, and Daniel Reynaud, 2009)
- 2023-05-29:
“Tree-sitter - a new parsing system for programming tools”
(Max Brunsfeld, 2018)
- 2023-04-09:
“Using Datalog with Binary Decision Diagrams for Program Analysis”
(John Whaley, Dzintars Avots, Michael Carbin, and Monica S. Lam, 2005)
- 2023-03-09:
“Java Generics are Turing Complete”
(Radu Grigore, 2016)
- 2023-03-09:
“mov is Turing-complete”
(Stephen Dolan, 2013)
- 2023-03-09:
“Control-Flow Bending: On the Effectiveness of Control-Flow Integrity”
(Nicholas Carlini, Antonio Barresi, Mathias Payer, David Wagner, and Thomas R.
Gross, 2015)
2022
- 2022-12-30:
“Emscripten: An LLVM-to-JavaScript Compiler”
(Alon Zakai, 2011)
- 2022-11-01:
“Truffle: A Self-Optimizing Runtime System”
(Christian Wimmer and Thomas Würthinger, 2012)
- 2022-11-01:
“Graal and Truffle: Modularity and Separation of Concerns as Cornerstones for
Building a Multipurpose Runtime”
(Thomas Würthinger, 2014)
- 2022-11-01:
“Machine Learning to Ease Understanding of Data Driven Compiler
Optimizations”
(Raphael Mosaner, 2020)
- 2022-11-01:
“Polyglot Code Finder”
(Jan Ehmueller, Alexander Riese, Hendrik Tjabben, Fabio Niephaus, and Robert
Hirschfeld, 2020)
- 2022-10-20:
“MLIR: Scaling Compiler Infrastructure for Domain Specific Computation”
(Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis,
Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache, and
Oleksandr Zinenko, 2021)
- 2022-07-19:
“miniAdapton: A Minimal Implementation of Incremental Computation in Scheme”
(Dakota Fisher, Matthew Hammer, William Byrd, and Matt Might, 2016)
- 2022-07-19:
“Adapton: Composable, Demand-Driven Incremental Computation”
(Matthew Hammer, Khoo Yit Phang, Michael Hicks, and Jeffrey Foster, 2014)
- 2022-07-18:
“Incremental Computation with Adapton”
(Matthew Hammer, 2015)
- 2022-03-31:
“Practical Second Futamura Projection: Partial Evaluation for
High-Performance Language Interpreters”
(Florian Latifi, 2019)
- 2022-03-31:
“Partial Evaluation of Computation Process—An Approach to a
Compiler-Compiler”
(Yoshihiko Futamura, 1971)
- 2022-03-08:
“LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation”
(Chris Lattner and Vikram Adve, 2004)
- 2022-03-03:
“LLVM 2.0 and Beyond!”
(Chris Lattner, 2007)
- 2022-02-26:
“Not So Fast: Analyzing the Performance of WebAssembly vs. Native Code”
(Abhinav Jangda, Bobby Powers, Emery D. Berger, and Arjun Guha, 2019)
- 2022-02-26:
“Bringing the Web up to Speed with WebAssembly”
(Andreas Haas, Andreas Rossberg, Derek L. Schuff, Ben L. Titzer, Michael
Holman, Dan Gohman, Luke Wagner, Alon Zakai, and JF Bastien, 2017)
- 2022-02-19:
“Formal verification of a realistic compiler”
(Xavier Leroy, 2008)
2021
2020
- 2020-12-18:
US 2009/0185789 A1: “Recording medium, reproduction apparatus, recording
method, reproducing method, program, and integrated circuit”
(Joseph McCrossan, Tomoyuki Okada, and Tomoki Ogawa, 2009), i.e., Presentation
Graphic Stream specification for Blu-ray subtitles
- 2020-12-18:
US 7,912,305 B1: “Method for run-length encoding of a bitmap data stream”
(Dirk Gandolph, Jobst Horentrup, Axel Kochale, Ralf Ostermann, and Hartmut
Peters, 2011), i.e., run-length encoding specification for Blu-ray subtitles
- 2020-11-20:
“Unbounded Spigot Algorithms for the Digits of Pi”
(Jeremy Gibbons, 2005)
- 2020-10-08:
“Enzyme: High-Performance Automatic Differentiation of LLVM”
(William Moses and Valentin Churavy, 2020)
- 2020-10-08:
“CIL: Common MLIR Dialect for C/C++ and Fortran”
(Prashantha NR, Vinay Madhusudan, and Ranjith Kumar, 2020)
- 2020-10-08:
“MLIR Tutorial”
(Mehdi Amini and River Riddle, 2020)
- 2020-10-06:
“llvm-diva – Debug Information Visual Analyzer”
(Phillip Power, 2020)
- 2020-10-06:
“Quickly Finding RISC-V Code Quality Issues with Differential Analysis”
(Luís Marques, 2020)
- 2020-10-05:
“Checked C: Adding Memory Safety to LLVM”
(Mandeep Singh Grang and Katherine Kjeer, 2020)
- 2020-10-05:
“Everything I Know About Debugging LLVM”
(Nick Desaulniers, 2020)
- 2020-10-05:
“Undef and Poison: Present and Future”
(Juneyoung Lee, 2020)
- 2020-09-28:
“Improving Flow Analyses via ΓCFA: Abstract Garbage Collection and Counting”
(Matt Might and Olin Shivers, ICFP 2006)
- 2020-09-28:
“Abstracting Abstract Machines”
(David Van Horn and Matt Might, ICFP 2010)
- 2020-09-28:
“Writing an interpreter, CESK-style”
(Matt Might, 2012)
- 2020-09-20:
“Understanding Real-World Concurrency Bugs in Go”
(Tengfei Tu, Xiaoyu Liu, Linhai Song, and Yiying Zhang, ASPLOS 2019)
[dataset]
- 2020-09-11:
“A Correspondence between Continuation Passing Style and Static Single
Assignment Form”
(Richard A. Kelsey, 1995)
- 2020-06-05:
“JavaScript: The First 20 Years”
(Allen Wirfs-Brock and Brendan Eich, 2020)
2019
2018
2016
TODO
- “A catalogue of optimizing transformations”
(Frances E. Allen and John Cocke, 1971)
(mentioned in the Cranelift RFC for e-graphs)
- “21 compilers and 3 orders of magnitude in 60 minutes: a wander through a
weird landscape to the heart of compilation”
(Graydon Hoare, 2019)
[post]
[HN]
- IRs
- Graal
- Again: “Graal IR: An Extensible Declarative Intermediate Representation”
(Gilles Duboscq, Lukas Stadler, Thomas Würthinger, Doug Simon, Christian
Wimmer, and Hanspeter Mössenböck, 2013)
- “One VM to Rule Them All”
(Thomas Würthinger, Christian Wimmer, Andreas Wöß, Lukas Stadler, Gilles
Duboscq, Christian Humer, Gregor Richards, Doug Simon, and Mario Wolczko,
2013)
- “An Intermediate Representation for Speculative Optimizations in a
Dynamic Compiler”
(Gilles Duboscq, Thomas Würthinger, Lukas Stadler, Christian Wimmer, Doug
Simon, and Hanspeter Mössenböck, 2013)
- “A Domain-Specific Language for Building Self-Optimizing AST Interpreters”
(Christian Humer, Christian Wimmer, Christian Wirth, Andreas Wöß, and
Thomas Würthinger, 2014)
- “Applying Futamura Projections to Compose Languages and Tools in GraalVM”
(Christian Humer, 2019)
- GraalVM 2016+
- HotSpot C2 / Sea of nodes
- “From Graphs to Quads: An Intermediate Representation’s Journey”
(Cliff Click, 1993)
- “Global Code Motion / Global Value Numbering”
(Cliff Click, 1995)
- “A Simple Graph-Based Intermediate Representation”
(Cliff Click and Michael Paleczny, 1995)
- “Combining Analyses, Combining Optimizations”
Thesis (Cliff Click, 1995)
- “Combining Analyses, Combining Optimizations”
(Cliff Click and Keith D. Cooper, 1995)
- “The Java HotSpot™ Server Compiler”
(Michael Paleczny, Christopher Vick, and Cliff Click, 2001)
- Haskell / G-machine
- “Implementing lazy functional languages on stock hardware: the Spineless
Tagless G-machine”
(Simon Peyton Jones, 1992)
- “The Implementation of Functional Programming Languages”
(Simon Peyton Jones, 1987)
- “Implementing Functional Languages: a tutorial”
(Simon Peyton Jones and David R Lester, 2000)
- “The Spineless Tagless G-machine, naturally”
(Jon Mountjoy, 1998)
- “Type classes in Haskell”
(Cordelia Hall, Kevin Hammond, Simon Peyton Jones, and Philip Wadler,
1994)
- “State in Haskell”
(John Launchbury and Simon Peyton Jones, 1995)
- “A transformation-based optimiser for Haskell”
(Simon Peyton Jones and André L.M. Santos, 1998)
- “C--: a portable assembly language that supports garbage collection”
(Simon Peyton Jones, Norman Ramsey, and Fermin Reig, 1999)
- “Beyond Functional Programming: The Verse Programming Language”
(Simon Peyton Jones, 2022)
- “Graph IRs for Impure Higher-Order Languages: Making Aggressive
Optimizations Affordable with Precise Effect Dependencies”
(Oliver Bračevac, Guannan Wei, Songlin Jia, Supun Abeysinghe, Yuxuan Jiang,
Yuyan Bao, and Tiark Rompf, OOPSLA 2023)
- SSA
- “Efficiently computing static single assignment form and the control
dependence graph” (Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N.
Wegman, and F. Kenneth Zadeck, 1991) describes a translation algorithm,
which converts programs into SSA and minimizes phi-functions (cited by
Kelsey 1995).
- What is Array SSA? From Wikipedia:
“IBM’s open source adaptive Java virtual machine, Jikes RVM, uses extended
Array SSA, an extension of SSA that allows analysis of scalars, arrays,
and object fields in a unified framework. Extended Array SSA analysis is
only enabled at the maximum optimization level, which is applied to the
most frequently executed portions of code.”
Mailing list post “Questions about the Array SSA form”
may be relevant.
- CPS
- Control-flow structuring
- E-graphs
- BDDs
- Regular expressions
- Parsing
- Parsing papers by John Aycock,
especially on Generalized LR parsing
- History