Points:
Most compilers offer a single level of abstraction to interface with the system: LLVM IR is roughly “C with vectors” and the JVM provides an “object-oriented type system with a garbage collector” abstraction. As a result, many LLVM frontends add their own IR for domain-specific optimizations.
Machine learning frameworks and compiler frontends similarly don’t share infrastructure or design principles.
MLIR solves this with a customizable, extensible IR built on three design principles: parsimony, traceability, and progressivity. Each principle often comes at the expense of another.
Little builtin, everything customizable (parsimony): The system should be able to express a diverse set of abstractions without hard-coding concepts from those abstractions into it.
SSA and regions (parsimony): Nested regions are a first-class concept, which allows direct representation of language abstractions.
Maintain higher-level semantics (progressivity): Only lose structure once no longer needed, as recovering semantics once lowered is fragile. Different levels of abstraction and concepts can be mixed and progressively lowered.
Declaration and validation (parsimony and traceability): Common transformations should be implementable declaratively as machine-analyzable rewrite rules. A compiler infrastructure is only as good as the transformations it supports.
Source location tracking (traceability): Operation provenance, including its source location and the transformations applied to it, should be traceable; this is essential, for example, for certifying cryptographic protocols.
Textual representation: Fully reflects in-memory representation. User-definable syntax.
Operations: Everything is modeled as ops. User-extensible. Declarative syntax based on LLVM TableGen. Consists of an opcode, zero or more typed operands and results in SSA form, attributes, regions, successor blocks, and location information. Passes treat unknown ops conservatively. Verifiers in ops enforce op invariants.
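These pieces are all visible in MLIR’s generic textual form. A sketch, using a made-up op name (mydialect.shift) purely for illustration:

```mlir
%x = arith.constant 3 : i32
%amount = arith.constant 1 : i32
// Generic form: quoted opcode, SSA operands, an attribute dictionary,
// a trailing function type, and source location info.
%res = "mydialect.shift"(%x, %amount) {saturate = true} : (i32, i32) -> i32 loc("example.c":7:12)
```

A pass that does not know mydialect.shift can still print, traverse, and conservatively transform around it.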
Attributes: Compile-time information about ops. Op instances have attribute values.
Location information: Processed and propagated throughout the system. Compact. Extensible and may refer to existing location-tracking systems; e.g., AST nodes, file-line-column address, DWARF, etc.
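For instance, locations can be plain file-line-column addresses or composed, as with a callsite location; the ops below are incidental:

```mlir
// A file-line-column location:
%c = arith.constant 1 : i32 loc("input.py":42:8)
// A callsite location composing two source locations:
%s = arith.addi %c, %c : i32 loc(callsite("util.py":3:1 at "input.py":10:2))
```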
Regions and blocks: Regions provide the nesting mechanism in MLIR: ops may contain regions, each of which holds a list of blocks forming a CFG, and each block in turn contains ops. Region semantics are specified by the enclosing op. Blocks end with a terminator op, which may have successor blocks. MLIR uses a functional form of SSA: instead of phi nodes, each block has typed arguments, and terminators pass values to them. The values of a region’s entry block arguments are defined by the enclosing op’s semantics. The explicit graph design and op extensibility are reminiscent of sea of nodes.
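A sketch of block arguments playing the role of a phi node, using the cf (control flow) dialect; both branch edges pass a value into the same successor block:

```mlir
func.func @max(%a: i32, %b: i32) -> i32 {
  %cond = arith.cmpi sgt, %a, %b : i32
  // Each successor edge carries its own operand for ^merge's argument,
  // replacing the phi node an LLVM-style IR would need.
  cf.cond_br %cond, ^merge(%a : i32), ^merge(%b : i32)
^merge(%v: i32):
  func.return %v : i32
}
```

Note that both edges target the same block with different operands, a case that phi nodes represent awkwardly.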
Value dominance and visibility: Ops can only use values that are in scope according to SSA dominance, nesting, and semantic restrictions. Region-based visibility is determined by lexical region definitions. Ops may be isolated from above, such as func.func (std.func in the paper), so values defined outside may not be referenced; this enables parallel compilation, because no use-def chains cross the isolation barriers.
Symbols and symbol tables: Ops can have an attached symbol table, which associates string names with IR objects; symbols can be referenced before they are defined, without dominance requirements.
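A sketch of symbol-based references: the module’s symbol table lets @callee be named before its definition appears, with no SSA dominance constraint between the two functions:

```mlir
module {
  func.func @caller(%x: i32) -> i32 {
    // Refers to @callee by symbol, not by SSA value.
    %0 = func.call @callee(%x) : (i32) -> i32
    func.return %0 : i32
  }
  func.func private @callee(%arg: i32) -> i32 {
    %c1 = arith.constant 1 : i32
    %r = arith.addi %arg, %c1 : i32
    func.return %r : i32
  }
}
```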
Dialects: Logical grouping of ops, attributes, and types. Intermixed for progressive lowering, reuse, extensibility, and flexibility.
Type system: Every value is typed, by the producing op or block argument. User-extensible and may refer to foreign type systems. Strict equality without coercions. Supports non-dependent types, including trivial, parametric, function, sum, and product types. Dependent types are possible, but would be opaque to the IR.
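Some builtin parametric and function types, for illustration; "test.source" is a hypothetical op used only to give each value a producer:

```mlir
%t = "test.source"() : () -> tensor<?x4xf32>      // parametric: dynamic x 4 x f32
%m = "test.source"() : () -> memref<16x16xi8>     // structured buffer type
%f = "test.source"() : () -> ((i32, i32) -> i64)  // a function type as a value's type
```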
Functions and modules: Programs are usually structured into functions and modules, both of which are implemented as ops. A module defines top-level constructs, including functions, and does not transfer control flow. A function has a single region and is compatible with func.call and func.return (std dialect in the paper).
TensorFlow graphs: Models TensorFlow dataflow graphs using the same infrastructure, analysis, and transformation capabilities. Most graph-level transformations are expressible in MLIR for both TensorFlow models and LLVM IR.
Polyhedral code generation: The affine dialect is a simplified polyhedral representation designed for accelerators. All memory accesses go through a structured multi-dimensional type that does not alias. Loops are represented as affine.for with loop-invariant bounds, giving static control flow. Loop bodies use affine.load and affine.store, which restrict indexing to affine functions of the surrounding iterators. Maintaining high-level loop structure obviates raising.
Fortran IR: The flang frontend uses MLIR to model high-level semantics and optimizations, such as advanced loop optimizations, array copy elimination, call specialization, and devirtualization. This reuses basic infrastructure and shared dialects such as OpenMP.
Domain-specific compilers: Rewrite patterns are dynamically extensible at runtime, e.g., for hardware drivers.